For many years, the scientific community has been concerned with increasing the accuracy of classification methods, and major advances have been made. Beyond accuracy, the growing volume of data generated every day by remote sensors poses further challenges. This work presents a tool within the scope of the InterIMAGE Cloud Platform (ICP), an open-source, distributed framework for automatic image interpretation. The tool, named ICP: Data Mining Package, performs supervised classification on very large volumes of data, usually referred to as big data, on a distributed infrastructure using Hadoop MapReduce. It implements four classification algorithms taken from WEKA’s machine learning library: Decision Trees, Naïve Bayes, Random Forest and Support Vector Machines (SVM). The results of an experimental analysis using an SVM classifier on data sets of different sizes and for different cluster configurations demonstrate the potential of the tool, as well as the aspects that affect its performance.
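To make the idea of supervised classification over MapReduce more concrete, the following is a minimal single-process sketch in Python. It is not the ICP implementation and uses neither Hadoop nor WEKA: the nearest-centroid classifier is a deliberately simple stand-in (it is not one of the four algorithms listed above), and all names (`train_centroids`, `map_phase`, `reduce_phase`, `run_job`) and the toy data are hypothetical. It only illustrates the assumed pattern of training a model once, applying it independently to each data split in a map step, and aggregating per-class counts in a reduce step.

```python
from collections import Counter

# Hypothetical toy training set: (feature vector, class label) pairs.
TRAIN = [((0.0, 0.1), "water"), ((0.2, 0.0), "water"),
         ((1.0, 0.9), "forest"), ((0.9, 1.1), "forest")]


def train_centroids(samples):
    """Fit a nearest-centroid model: one mean feature vector per class."""
    sums, counts = {}, Counter()
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] += 1
    return {label: tuple(v / counts[label] for v in acc)
            for label, acc in sums.items()}


def classify(model, features):
    """Return the label whose centroid is closest to the feature vector."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


def map_phase(model, split):
    """Mapper: emit one (predicted_label, 1) pair per record of a split."""
    return [(classify(model, record), 1) for record in split]


def reduce_phase(pairs):
    """Reducer: sum the emitted counts for each predicted label."""
    totals = Counter()
    for label, count in pairs:
        totals[label] += count
    return dict(totals)


def run_job(model, splits):
    """Run the map step on every split, then reduce all emitted pairs.

    In a real cluster each split would be processed on a different node;
    here the splits are simply processed one after another.
    """
    emitted = [pair for split in splits for pair in map_phase(model, split)]
    return reduce_phase(emitted)


MODEL = train_centroids(TRAIN)
# Two hypothetical input splits of unlabeled records.
SPLITS = [[(0.1, 0.1), (0.95, 1.0)],
          [(0.05, 0.0), (1.1, 0.9), (0.8, 0.8)]]
RESULT = run_job(MODEL, SPLITS)
```

Because each record is classified independently of the others, the map step can in principle be spread over any number of nodes, which is what makes this pattern a natural fit for a framework such as Hadoop.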
The amount of data generated in all fields of science is increasing extremely fast (Sagiroglu et al., 2013) (Zaslavsky et al., 2012) (Suthaharan, 2014) (Kishor, 2013). MapReduce frameworks (Dean et al., 2004), such as Hadoop (ApacheHadoop, 2014), are becoming a common and reliable choice to tackle the so-called big data challenge. Due to its nature and complexity, the analysis of big data raises new issues and challenges (Li et al., 2014) (Suthaharan, 2014). Although many machine learning approaches have been proposed to analyse small to medium-sized data sets, in a supervised or unsupervised way, only a few of them have been properly adapted to handle large data sets (Yadav et al., 2013) (Dhillon et al., 2014) (Pakize et al., 2014). An overview of some data mining approaches for very large data sets can be found in (He et al., 2010) (Bekkerman et al., 2012) (Nandakumar et al., 2014).
By: V. A. Ayma, R. S. Ferreira, P. Happ, D. Oliveira, R. Feitosa, G. Costa, A. Plaza, P. Gamba
File information: English language / 5 pages / Size: 836 K