Interpretable hierarchical clustering by constructing an unsupervised decision tree

被引:82
作者
Basak, J [1 ]
Krishnapuram, R [1 ]
机构
[1] Indian Inst Technol, IBM India Res Lab, New Delhi 110016, India
关键词
unsupervised decision tree; entropy; data set segmentation;
D O I
10.1109/TKDE.2005.11
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In this paper, we propose a method for hierarchical clustering based on the decision tree approach. As in the case of supervised decision tree, the unsupervised decision tree is interpretable in terms of rules, i.e., each leaf node represents a cluster, and the path from the root node to a leaf node represents a rule. The branching decision at each node of the tree is made based on the clustering tendency of the data available at the node. We present four different measures for selecting the most appropriate attribute to be used for splitting the data at every branching node (or decision node), and two different algorithms for splitting the data at each decision node. We provide a theoretical basis for the approach and demonstrate the capability of the unsupervised decision tree for segmenting various data sets. We also compare the performance of the unsupervised decision tree with that of the supervised one.
引用
收藏
页码:121 / 132
页数:12
相关论文
共 33 条
[1]  
[Anonymous], 1997, Proceedings of the fourteenth international conference on machine learning, DOI DOI 10.1016/J.ESWA.2008.05.026
[2]   Unsupervised feature selection using a neuro-fuzzy approach [J].
Basak, J ;
De, RK ;
Pal, SK .
PATTERN RECOGNITION LETTERS, 1998, 19 (11) :997-1006
[3]   A CLASSIFICATION OF PRESENCE ABSENCE BASED DISSIMILARITY COEFFICIENTS [J].
BAULIEU, FB .
JOURNAL OF CLASSIFICATION, 1989, 6 (02) :233-246
[4]  
BELLOT P, 2000, P RIAO 2000 C, P344
[5]  
Ben-Dor A, 2001, P 5 ANN INT C COMP M, P31, DOI DOI 10.1145/369133.369167
[6]  
Bock H.H., 1989, CONCEPTUAL NUMERICAL, P12
[7]  
BOCK HH, 1994, P 1 US JAP C FRONT S
[8]  
BOLEY D, 1998, TR98012 U MINN DEP C
[9]  
Breiman L., 1998, CLASSIFICATION REGRE
[10]  
BRODLEY CE, 1995, MACH LEARN, V19, P45, DOI 10.1007/BF00994660