Local equivalences of distances between clusterings—a geometric perspective

被引:0
作者
Marina Meilă
机构
[1] University of Washington,Department of Statistics
来源
Machine Learning | 2012年 / 86卷
关键词
Clustering; Comparing partitions; divergence; Misclassification error; Rand index; Convexity;
D O I
暂无
中图分类号
学科分类号
摘要
In comparing clusterings, several different distances and indices are in use. We prove that the Misclassification Error distance, the Hamming distance (equivalent to the unadjusted Rand index), and the χ2 distance between partitions are equivalent in the neighborhood of 0. In other words, if two partitions are very similar, then one distance defines upper and lower bounds on the other and viceversa. The proofs are geometric and rely on the concavity of the distances. The geometric intuitions themselves advance the understanding of the space of all clusterings. To our knowledge, this is the first result of its kind.
引用
收藏
页码:369 / 389
页数:20
相关论文
共 11 条
[1]  
Bach F.(2006)Learning spectral clustering with applications to speech separation Journal of Machine Learning Research 7 1963-2001
[2]  
Jordan M. I.(2005)The Dantzig selector: statistical estimation when Annals of Statistics 35 2313-2351
[3]  
Candès E. J.(1995) is much larger than  Machine Learning 20 273-297
[4]  
Tao T.(2006)Support vector networks IEEE Transactions on Information Theory 52 1289-1306
[5]  
Cortes C.(1985)Compressed sensing Journal of Classification 2 193-218
[6]  
Vapnik V.(2007)Comparing partitions Journal of Multivariate Analysis 98 873-895
[7]  
Donoho D.(1971)Comparing clusterings—an information based distance Journal of the American Statistical Association 66 846-850
[8]  
Hubert L.(undefined)Objective criteria for the evaluation of clustering methods undefined undefined undefined-undefined
[9]  
Arabie P.(undefined)undefined undefined undefined undefined-undefined
[10]  
Meilă M.(undefined)undefined undefined undefined undefined-undefined