Cluster ensemble selection based on a new cluster stability measure

Cited by: 59
Authors
Alizadeh, Hosein [1]
Minaei-Bidgoli, Behrouz [1]
Parvin, Hamid [1]
Affiliations
[1] Iran Univ Sci & Technol, Sch Comp Engn, Tehran, Iran
Keywords
Clustering ensemble; APMM stability measure; extended evidence accumulation clustering; cluster evaluation; validation
DOI
10.3233/IDA-140647
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Many stability measures, such as Normalized Mutual Information (NMI), have been proposed to validate a set of partitionings. A set of partitionings may well contain one or more high-quality clusters and still be judged poor by a stability measure, and as a result be neglected entirely. Inspired by approaches that measure the efficacy of a set of partitionings, researchers have tried to define new measures for evaluating a single cluster. Thus far, the measures defined for assessing a cluster are mostly based on the well-known NMI measure. This paper discusses the drawback of this commonly used approach and then proposes a new asymmetric criterion, called the Alizadeh-Parvin-Moshki-Minaei criterion (APMM), to assess the association between a cluster and a set of partitionings. We show that the APMM criterion overcomes the deficiency of the conventional NMI measure. We also propose a clustering ensemble framework that uses APMM to find the best-performing clusters. The framework uses Average APMM (AAPMM) as a fitness measure to select a subset of clusters instead of using all of the results. Any cluster that satisfies a predefined threshold on this measure is selected to participate in an elite ensemble. To combine the chosen clusters, a co-association matrix-based consensus function is used, by which the resultant partitionings are obtained. Because Evidence Accumulation Clustering (EAC) cannot appropriately derive the co-association matrix from a subset of clusters, a new EAC-based method, called Extended EAC (EEAC), is employed to construct the co-association matrix from the chosen subset of clusters. Empirical studies show that our proposed approach outperforms other cluster ensemble approaches.
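The selection and combination steps described in the abstract can be illustrated with a minimal sketch. It does not reproduce the paper's method: the abstract gives no formula for APMM or AAPMM, so the per-cluster stability scores are taken as precomputed inputs, and the normalization by max(n_i, n_j) in the co-association step is an assumption about how a matrix might be built from a subset of clusters. The function names `select_clusters` and `co_association` are hypothetical.

```python
import numpy as np


def select_clusters(clusters, scores, threshold):
    """Keep the clusters whose stability score meets the threshold.

    `scores` stands in for a per-cluster fitness such as AAPMM; how it is
    computed is not given in the abstract, so it is supplied by the caller.
    """
    return [c for c, s in zip(clusters, scores) if s >= threshold]


def co_association(selected, n_points):
    """Build a co-association matrix from a subset of clusters.

    Assumption (not taken from the abstract): entry (i, j) is the number of
    selected clusters containing both i and j, divided by the larger of the
    numbers of selected clusters containing i and containing j.
    """
    shared = np.zeros((n_points, n_points))
    counts = np.zeros(n_points)
    for members in selected:
        idx = np.array(sorted(set(members)), dtype=int)
        counts[idx] += 1                  # clusters containing each point
        shared[np.ix_(idx, idx)] += 1     # clusters containing each pair
    denom = np.maximum.outer(counts, counts)
    out = np.zeros_like(shared)
    # Pairs where neither point appears in any selected cluster stay at 0.
    np.divide(shared, denom, out=out, where=denom > 0)
    return out


# Toy data: four candidate clusters over points 0..3 with made-up scores.
clusters = [{0, 1}, {2, 3}, {0, 1, 2}, {3}]
scores = [0.9, 0.8, 0.2, 0.7]
elite = select_clusters(clusters, scores, threshold=0.5)
matrix = co_association(elite, n_points=4)
```

The resulting matrix could then be passed to a hierarchical clustering routine to obtain a consensus partitioning, as in co-association based consensus functions generally.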
Pages: 389-408
Page count: 20