Description
ABSTRACT
Data mining applications place special requirements on clustering algorithms, including: the ability to find clusters embedded in subspaces of high-dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical data distribution, and insensitivity to the order of input records. We present CLIQUE, a clustering algorithm that satisfies each of these requirements. CLIQUE identifies dense clusters in subspaces of maximum dimensionality. It generates cluster descriptions in the form of DNF expressions that are minimized for ease of comprehension. It produces identical results irrespective of the order in which input records are presented and does not presume any specific mathematical form for data distribution. Through experiments, we show that CLIQUE efficiently finds accurate clusters in large high-dimensional datasets.
INTRODUCTION
Clustering is a descriptive task that seeks to identify homogeneous groups of objects based on the values of their attributes (dimensions) (Jain and Dubes, 1988; Kaufman and Rousseeuw, 1990). Clustering techniques have been studied extensively in statistics (Arabie and Hubert, 1996), pattern recognition (Duda and Hart, 1973; Fukunaga, 1990), and machine learning (Cheeseman and Stutz, 1996; Michalski and Stepp, 1983). Recent work in the database community includes CLARANS (Ng and Han, 1994), Focused CLARANS (Ester et al., 1995), BIRCH (Zhang et al., 1996), DBSCAN (Ester et al., 1996) and CURE (Guha et al., 1998).
Year: 2005
Publisher: SPRINGER
By: RAKESH AGRAWAL, JOHANNES GEHRKE, DIMITRIOS GUNOPULOS, PRABHAKAR RAGHAVAN
File information: English / 29 pages / size: 627 KB