Quality indices for (practical) clustering evaluation

Cited by: 13
Authors
Cardoso, Margarida G. M. S. [1]
de Carvalho, Andre Ponce de Leon F. [2]
Affiliations
[1] ISCTE Business Sch, Dept Quantitat Methods, P-1649026 Lisbon, Portugal
[2] Univ Sao Paulo, Inst Math & Comp Sci, Dept Comp Sci, BR-13560970 Sao Carlos, SP, Brazil
Keywords
Cluster validation; validation indices; quality indices; clustering; VALIDATION INDEX; VALIDITY; NUMBER;
DOI
10.3233/IDA-2009-0390
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Clustering quality or validation indices allow the evaluation of the quality of clustering in order to support the selection of a specific partition or clustering structure in its natural unsupervised environment, where the real solution is unknown or not available. In this paper, we investigate the use of quality indices mostly based on the concepts of clusters' compactness and separation, for the evaluation of clustering results (partitions in particular). This work intends to offer a general perspective regarding the appropriate use of quality indices for the purpose of clustering evaluation. After presenting some commonly used indices, as well as indices recently proposed in the literature, key issues regarding the practical use of quality indices are addressed. A general methodological approach is presented which considers the identification of appropriate indices thresholds. This general approach is compared with the simple use of quality indices for evaluating a clustering solution.
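To make the notions of compactness and separation concrete, the following is a minimal sketch of one well-known index built on them, the Davies-Bouldin index. It is an illustration only: the abstract does not say which indices the paper examines, and the function names (`centroid`, `davies_bouldin`) and the toy data are assumptions introduced here. Compactness is taken as the mean distance of a cluster's points to its centroid, separation as the distance between centroids, and lower values suggest a better partition.

```python
import math
from collections import defaultdict


def centroid(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def davies_bouldin(points, labels):
    """Davies-Bouldin index; lower suggests compact, well-separated clusters."""
    clusters = defaultdict(list)
    for p, lab in zip(points, labels):
        clusters[lab].append(p)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    centers = {lab: centroid(pts) for lab, pts in clusters.items()}
    # Compactness: mean distance of each cluster's points to its centroid.
    scatter = {
        lab: sum(math.dist(p, centers[lab]) for p in pts) / len(pts)
        for lab, pts in clusters.items()
    }
    total = 0.0
    for i in clusters:
        # Worst-case ratio of combined scatter to centroid separation.
        # Assumes no two centroids coincide (that would divide by zero).
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


# Hypothetical toy data: two visibly distinct groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
good = [0, 0, 0, 1, 1, 1]  # partition matching the two groups
poor = [0, 1, 0, 1, 0, 1]  # partition that mixes the groups

print(davies_bouldin(points, good) < davies_bouldin(points, poor))
```

On this toy data the partition that follows the two groups scores lower than the mixed one. A single raw value like this says little on its own, which is the kind of practical issue the abstract raises when it mentions identifying appropriate thresholds.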
Pages: 725-740
Number of pages: 16