Interpretable Clustering via Discriminative Rectangle Mixture Model

Cited: 0
Authors
Chen, Junxiang [1 ]
Chang, Yale [1 ]
Hobbs, Brian [2 ]
Castaldi, Peter [2 ]
Cho, Michael [2 ]
Silverman, Edwin [2 ]
Dy, Jennifer [1 ]
Affiliations
[1] Northeastern Univ, Dept Elect & Comp Engn, Boston, MA 02115 USA
[2] Brigham & Womens Hosp, Channing Div Network Med, 75 Francis St, Boston, MA 02115 USA
Source
2016 IEEE 16TH INTERNATIONAL CONFERENCE ON DATA MINING (ICDM) | 2016
Keywords
interpretable clustering; semi-supervised clustering;
DOI
10.1109/ICDM.2016.166
Chinese Library Classification (CLC)
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering is typically applied as a tool for exploratory data analysis. Given the exploratory nature of this task, it is beneficial if a clustering method produces interpretable results and allows domain knowledge to be incorporated. This motivates us to develop a probabilistic discriminative model that learns a rectangular decision rule for each cluster, which we call the Discriminative Rectangle Mixture (DReaM) model. DReaM yields interpretable clustering results because the rectangular decision rules it discovers explicitly show how each cluster is defined and how it differs from the others. The model also makes it easy to exploit existing rules, since informative prior distributions can be placed on the rectangular rules. Moreover, DReaM allows the features used to generate rules to differ from the features used to discover cluster structure. We approximate the distribution over the discovered rules via variational inference. Experimental results demonstrate that DReaM produces more interpretable clusterings while performing comparably to existing methods on traditional clustering tasks. Furthermore, in real applications, DReaM effectively exploits domain knowledge and generates reasonable clustering results.
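To illustrate the kind of decision rule the abstract describes, the sketch below shows how a rectangular (axis-aligned box) rule makes cluster membership human-readable: a point belongs to a cluster when every rule feature lies inside that cluster's interval. The bounds and cluster names here are invented for illustration; they are not learned by DReaM, which infers such rules probabilistically via variational inference.

```python
import numpy as np

# Hypothetical per-cluster rectangles over two rule features.
# Each rule reads as, e.g., "cluster_0: 0 <= x1 <= 1 and 0 <= x2 <= 2".
rules = {
    "cluster_0": {"lower": np.array([0.0, 0.0]), "upper": np.array([1.0, 2.0])},
    "cluster_1": {"lower": np.array([1.5, 0.5]), "upper": np.array([3.0, 1.5])},
}

def assign(x, rules):
    """Return the first cluster whose rectangle contains x, else None."""
    for name, r in rules.items():
        if np.all(x >= r["lower"]) and np.all(x <= r["upper"]):
            return name
    return None

print(assign(np.array([0.5, 1.0]), rules))  # falls inside cluster_0's box
print(assign(np.array([2.0, 1.0]), rules))  # falls inside cluster_1's box
```

Because each rule is a conjunction of per-feature interval tests, a domain expert can read off exactly why a point was assigned to a cluster, which is the interpretability property the paper targets.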
Pages: 823-828
Page count: 6
Cited References
15 in total
[1]  
Bache K., 2013, UCI Machine Learning Repository
[2]  
Bishop C.M., 2006, PATTERN RECOGN, V4, P738, DOI 10.1117/1.2819119
[3]   Interpretable clustering using unsupervised binary trees [J].
Fraiman, Ricardo ;
Ghattas, Badih ;
Svarc, Marcela .
ADVANCES IN DATA ANALYSIS AND CLASSIFICATION, 2013, 7 (02) :125-145
[4]  
Kim B., 2015, ADV NEURAL INFORM PR, P2251
[5]  
Klein D., 2002, P 19 INT C MACH LEAR, P307
[6]   USE OF RANKS IN ONE-CRITERION VARIANCE ANALYSIS [J].
KRUSKAL, WH ;
WALLIS, WA .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 1952, 47 (260) :583-621
[7]  
Liu B., 2000, P 9 INT C INF KNOWL, P20, DOI 10.1145/354756.354775
[8]  
MacQueen, 1967, BERK S MATH STAT PRO
[9]  
Niu D., 2012, J MACH LEARN RES, V22, P814
[10]   Iterative Discovery of Multiple Alternative Clustering Views [J].
Niu, Donglin ;
Dy, Jennifer G. ;
Jordan, Michael I. .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2014, 36 (07) :1340-1353