Consensus Clustering Based on a New Probabilistic Rand Index with Application to Subtopic Retrieval

Cited by: 47
Authors
Carpineto, Claudio [1]
Romano, Giovanni [1]
Affiliation
[1] Fdn Ugo Bordoni, I-00161 Rome, Italy
Keywords
Consensus clustering; Rand index; probabilistic Rand index; search results clustering; subtopic retrieval
DOI
10.1109/TPAMI.2012.80
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
We introduce a probabilistic version of the well-known Rand Index (RI) for measuring the similarity between two partitions, called Probabilistic Rand Index (PRI), in which agreements and disagreements at the object-pair level are weighted according to the probability of their occurring by chance. We then cast consensus clustering as an optimization problem of the PRI value between a target partition and a set of given partitions, experimenting with a simple and very efficient stochastic optimization algorithm. Remarkable performance gains over input partitions as well as over existing related methods are demonstrated through a range of applications, including a new use of consensus clustering to improve subtopic retrieval.
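To make the quantities in the abstract concrete, here is a minimal Python sketch of the plain (unweighted) Rand Index and of consensus clustering posed as a search for a target partition that maximizes mean agreement with the input partitions. The abstract does not give the PRI formula, so the chance-based pair weighting is not reproduced; the unweighted RI stands in for it. The function names `rand_index` and `consensus`, the parameter `k`, and the single-object relabeling hill climb are all illustrative assumptions, not the authors' algorithm.

```python
import random
from itertools import combinations


def rand_index(a, b):
    """Plain Rand Index between two partitions given as equal-length label lists.

    A pair of objects counts as an agreement when both partitions put the
    pair in the same cluster, or both put it in different clusters.
    """
    if len(a) != len(b):
        raise ValueError("partitions must label the same objects")
    pairs = list(combinations(range(len(a)), 2))
    if not pairs:
        return 1.0
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def consensus(partitions, k, iters=2000, seed=0):
    """Hypothetical stochastic hill climb (an assumption, not the paper's method).

    Starts from the first input partition and repeatedly relabels one random
    object with a label in range(k), keeping the move only if the mean Rand
    Index against all input partitions does not decrease.
    """
    rng = random.Random(seed)
    n = len(partitions[0])

    def score(target):
        return sum(rand_index(target, p) for p in partitions) / len(partitions)

    best = list(partitions[0])
    best_score = score(best)
    for _ in range(iters):
        i = rng.randrange(n)
        old, new = best[i], rng.randrange(k)
        if new == old:
            continue
        best[i] = new
        s = score(best)
        if s >= best_score:
            best_score = s
        else:
            best[i] = old  # revert a move that lowers the objective
    return best, best_score
```

Calling `consensus([[0, 0, 1, 1], [0, 0, 1, 1], [0, 0, 1, 0]], k=2)` returns a label list together with its mean Rand Index against the three inputs. Replacing `rand_index` with a weighted pair score would be the natural place to plug in a PRI-style measure once its definition is taken from the paper itself.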
Pages: 2315-2326
Number of pages: 12