Multi-view clustering

Cited by: 575
Authors
Bickel, S [1]
Scheffer, T [1]
Affiliation
[1] Humboldt Univ, Dept Comp Sci, D-10099 Berlin, Germany
Source
FOURTH IEEE INTERNATIONAL CONFERENCE ON DATA MINING, PROCEEDINGS | 2004
DOI
10.1109/ICDM.2004.10095
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We consider clustering problems in which the available attributes can be split into two independent subsets, such that either subset suffices for learning. Example applications of this multi-view setting include clustering of web pages which have an intrinsic view (the pages themselves) and an extrinsic view (e.g., anchor texts of inbound hyperlinks); multi-view learning has so far been studied in the context of classification. We develop and study partitioning and agglomerative, hierarchical multi-view clustering algorithms for text data. We find empirically that the multi-view versions of k-Means and EM greatly improve on their single-view counterparts. By contrast, we obtain negative results for agglomerative hierarchical multi-view clustering. Our analysis explains this surprising phenomenon.
Pages: 19-26
Page count: 8