Description
ABSTRACT
The rapid advance of computer technologies in data processing, collection, and storage has provided unparalleled opportunities to expand capabilities in production, services, communications, and research. However, immense quantities of high-dimensional data pose renewed challenges to state-of-the-art data mining techniques. Feature selection is an effective technique for dimension reduction and an essential step in successful data mining applications. It is a research area of great practical significance and has been developed and evolved to answer the challenges due to data of increasingly high dimensionality. Its direct benefits include: building simpler and more comprehensible models, improving data mining performance, and helping prepare, clean, and understand data. We first briefly introduce the key components of feature selection, and review its developments with the growth of data mining. We then overview FSDM and the papers of FSDM10, which showcase a vibrant research field with contemporary interests, new applications, and ongoing research efforts. We then examine nascent demands in data-intensive applications and identify some potential lines of research that require multidisciplinary efforts.
INTRODUCTION
Data mining is a multidisciplinary effort to extract nuggets of knowledge from data. The proliferation of large data sets within many domains poses unprecedented challenges to data mining (Han and Kamber, 2001). Not only are data sets getting larger, but new types of data are becoming prevalent, such as data streams on the Web, microarrays in genomics and proteomics, and networks in social computing and systems biology. Researchers are realizing that in order to achieve successful data mining, feature selection is an indispensable component (Liu and Motoda, 1998; Guyon and Elisseeff, 2003; Liu and Motoda, 2007). It is a process of selecting a subset of original features according to certain criteria, and an important and frequently used technique in data mining for dimension reduction. It reduces the number of features, removes irrelevant, redundant, or noisy features, and brings about palpable effects for applications: speeding up a data mining algorithm, improving learning accuracy, and leading to better model comprehensibility.
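As a rough illustration of "selecting a subset of original features according to certain criteria", the sketch below scores each feature by its absolute Pearson correlation with a target and keeps the top k. The criterion, the function names (`pearson`, `select_features`), and the toy data are all assumptions made for this example; they are not taken from the paper, which surveys many other criteria and search strategies.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information about the target
    return cov / sqrt(var_x * var_y)


def select_features(rows, target, k):
    """Return the sorted indices of the k features with the highest |correlation| to target."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, target)), j))
    # Highest score first; ties are broken by the lower feature index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return sorted(j for _, j in scores[:k])


# Hypothetical toy data: feature 0 tracks the target, feature 1 is constant
# (irrelevant), and feature 2 is only weakly related to the target.
rows = [
    [1.0, 5.0, 0.3],
    [2.0, 5.0, 0.9],
    [3.0, 5.0, 0.1],
    [4.0, 5.0, 0.7],
]
target = [1.1, 1.9, 3.2, 3.9]

selected = select_features(rows, target, k=1)
# Project the data onto the selected features only (dimension reduction).
reduced = [[row[j] for j in selected] for row in rows]
```

This is a simple filter-style criterion that scores each feature independently, so it can drop irrelevant features but would not, by itself, detect redundancy between two features that are both correlated with the target.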
Year : 2010
Publisher : Workshop and Conference Proceedings
By : Huan Liu , Hiroshi Motoda , Rudy Setiono , Zheng Zhao
File Information : English Language / 10 Pages / Size : 181 KB