A human-computer interactive method for projected clustering

Cited by: 34
Authors
Aggarwal, CC [1]
Affiliations
[1] IBM Corp, TJ Watson Res Ctr, Hawthorne, NY 10532 USA
Keywords
high-dimensional data mining; clustering; human-computer interaction
DOI
10.1109/TKDE.2004.1269669
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering is a central task in data mining applications such as customer segmentation. High-dimensional data has always been a challenge for clustering algorithms because of the inherent sparsity of the points. Therefore, techniques have recently been proposed to find clusters in hidden subspaces of the data. However, since the behavior of the data can vary considerably in different subspaces, it is often difficult to define the notion of a cluster with the use of simple mathematical formalizations. The widely used practice of treating clustering as the exact problem of optimizing an arbitrarily chosen objective function can often lead to misleading results. In fact, the proper clustering definition may vary not only with the application and data set but also with the perceptions of the end user. This makes it difficult to separate the definition of the clustering problem from the perception of an end user. In this paper, we propose a system which performs high-dimensional clustering by cooperation between the human and the computer. The complex task of cluster creation is accomplished through a combination of human intuition and the computational support provided by the computer. The result is a system which leverages the best abilities of both the human and the computer for solving the clustering problem.
Pages: 448-460
Page count: 13