Consensus Clustering Based on a New Probabilistic Rand Index with Application to Subtopic Retrieval

Cited by: 47
Authors
Carpineto, Claudio [1]
Romano, Giovanni [1]
Affiliations
[1] Fdn Ugo Bordoni, I-00161 Rome, Italy
Keywords
Consensus clustering; Rand index; probabilistic Rand index; search results clustering; subtopic retrieval;
DOI
10.1109/TPAMI.2012.80
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
We introduce a probabilistic version of the well-known Rand Index (RI) for measuring the similarity between two partitions, called Probabilistic Rand Index (PRI), in which agreements and disagreements at the object-pair level are weighted according to the probability of their occurring by chance. We then cast consensus clustering as an optimization problem of the PRI value between a target partition and a set of given partitions, experimenting with a simple and very efficient stochastic optimization algorithm. Remarkable performance gains over input partitions as well as over existing related methods are demonstrated through a range of applications, including a new use of consensus clustering to improve subtopic retrieval.
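For orientation, the classic (unweighted) Rand Index that the abstract takes as its starting point can be sketched as below. This is only the standard RI, which counts every object pair equally; it is not the paper's PRI, whose chance-based pair weighting is not specified in this record. The function name `rand_index` and the label-list representation of a partition are assumptions made for illustration.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Classic Rand Index between two partitions given as cluster-label lists.

    Illustrative sketch of the standard RI only, not the PRI from the paper.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same objects")
    n = len(labels_a)
    if n < 2:
        raise ValueError("at least two objects are needed to form a pair")

    agreements = 0
    total_pairs = 0
    for i, j in combinations(range(n), 2):
        same_in_a = labels_a[i] == labels_a[j]
        same_in_b = labels_b[i] == labels_b[j]
        # A pair agrees when both partitions put it together or both split it.
        if same_in_a == same_in_b:
            agreements += 1
        total_pairs += 1
    return agreements / total_pairs
```

Cluster labels are arbitrary identifiers, so `rand_index([0, 0, 1, 1], [1, 1, 0, 0])` treats the two partitions as identical. According to the abstract, the PRI replaces this uniform pair count with weights tied to the probability of each agreement or disagreement occurring by chance.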
Pages: 2315-2326
Number of pages: 12