An ensemble hierarchical clustering algorithm based on merits at cluster and partition levels

Cited by: 20
Authors
Huang, Qirui [1 ]
Gao, Rui [2 ]
Akhavan, Hoda [3 ]
Affiliations
[1] Nanyang Inst Technol, Sch Informat Engn, Nanyang 473004, Henan, Peoples R China
[2] Dongying Vocat Inst, Acad Affairs Off, Dongying 257000, Shandong, Peoples R China
[3] Amirkabir Univ Technol, Comp Engn & Informat Technol Dept, Tehran, Iran
Keywords
Ensemble clustering; Cluster consensus; Hyper-cluster; Merit level; Robustness measure; QUALITY; PREDICTION; DIVERSITY; CRITERION; SELECTION;
DOI
10.1016/j.patcog.2022.109255
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Ensemble clustering has emerged as a combination of several basic clustering algorithms to achieve high quality final clustering. However, this technique is challenging due to the complexities in primary clusters such as overlapping, vagueness, instability and uncertainty. Typically, ensemble clustering uses all the primary clusters into partitions for consensus, where the merits of a cluster or a partition can be considered to improve the quality of the consensus. In general, the robustness of a partition may be poorly measured, while having some high-quality clusters. Inspired by the evaluation of cluster and partition, this paper proposes an ensemble hierarchical clustering algorithm based on the cluster consensus selection approach. Here, the selection of a subset of primary clusters from partitions based on their merit level is emphasized. Merit level is defined using the development of Normalized Mutual Information measure. Clusters of basic clustering algorithms that satisfy the predefined threshold of this measure are selected to participate in the final consensus. In addition, the consensus of the selected primary clusters to create the final clusters is performed based on the clusters clustering technique. In this technique, the selected primary clusters are re-clustered to create hyper-clusters. Finally, the final clusters are formed by assigning instances to hyper-clusters with the highest similarity. Here, an innovative criterion based on merit and cluster size for defining similarity is presented. The performance of the proposed algorithm has been proven by extensive experiments on real-world datasets from the UCI repository compared to state-of-the-art algorithms such as CPDM, ENMI, IDEA, CFTLC and SSCEN. © 2022 Elsevier Ltd. All rights reserved.
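The pipeline outlined in the abstract (score each primary cluster, keep those above a threshold, re-cluster the survivors into hyper-clusters, then assign instances) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the formulas, so the merit definition (mean NMI between a cluster's membership indicator and the other partitions), the Jaccard average-linkage merging, and the merit-times-size similarity are all assumptions made here for illustration, as are the function names.

```python
import math
from collections import Counter

def nmi(a, b):
    """Normalized mutual information of two label sequences (sqrt normalization)."""
    n = len(a)
    ca, cb, cab = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum(v / n * math.log(v * n / (ca[x] * cb[y])) for (x, y), v in cab.items())
    ha = -sum(v / n * math.log(v / n) for v in ca.values())
    hb = -sum(v / n * math.log(v / n) for v in cb.values())
    return mi / math.sqrt(ha * hb) if ha > 0 and hb > 0 else 0.0

def select_clusters(partitions, threshold):
    """Keep (members, merit) for primary clusters whose merit meets the threshold.
    ASSUMPTION: merit = mean NMI of the cluster's 0/1 indicator vs. the other partitions."""
    n = len(partitions[0])
    selected = []
    for j, part in enumerate(partitions):
        others = [p for i, p in enumerate(partitions) if i != j]
        for label in sorted(set(part)):
            members = frozenset(i for i, l in enumerate(part) if l == label)
            indicator = [1 if i in members else 0 for i in range(n)]
            merit = sum(nmi(indicator, p) for p in others) / len(others)
            if merit >= threshold:
                selected.append((members, merit))
    return selected

def jaccard(a, b):
    return len(a & b) / len(a | b)

def hyper_clusters(selected, k):
    """Re-cluster the selected clusters into k groups.
    ASSUMPTION: agglomerative merging with average-linkage Jaccard similarity."""
    groups = [[c] for c in selected]
    def link(g, h):
        return sum(jaccard(a[0], b[0]) for a in g for b in h) / (len(g) * len(h))
    while len(groups) > k:
        pairs = [(i, j) for i in range(len(groups)) for j in range(i + 1, len(groups))]
        i, j = max(pairs, key=lambda ij: link(groups[ij[0]], groups[ij[1]]))
        groups[i] += groups.pop(j)  # j > i, so index i is still valid after the pop
    return groups

def assign(groups, n):
    """Give each instance the index of its most similar hyper-cluster.
    ASSUMPTION: similarity = share of the group's merit*size mass that contains the instance."""
    labels = []
    for i in range(n):
        scores = []
        for g in groups:
            total = sum(m * len(c) for c, m in g) or 1.0
            scores.append(sum(m * len(c) for c, m in g if i in c) / total)
        labels.append(max(range(len(groups)), key=scores.__getitem__))
    return labels

def consensus(partitions, threshold, k):
    selected = select_clusters(partitions, threshold)
    return assign(hyper_clusters(selected, k), len(partitions[0]))

if __name__ == "__main__":
    # Three roughly agreeing base partitions plus one noisy partition (toy data).
    parts = [[0, 0, 0, 1, 1, 1], [0, 0, 0, 1, 1, 1], [0, 0, 1, 1, 1, 1], [0, 1, 0, 1, 0, 1]]
    print(consensus(parts, threshold=0.2, k=2))
```

On this toy input the two clusters of the noisy fourth partition fall below the threshold and are excluded, so the consensus is built only from the six clusters of the first three partitions. The threshold and the number of hyper-clusters `k` are free parameters in this sketch.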
Pages: 17
Related papers
50 in total
  • [21] A decentralized algorithm for distributed ensemble clustering
    Rosato, Antonello
    Altilio, Rosa
    Panella, Massimo
    INFORMATION SCIENCES, 2021, 578 : 417 - 434
  • [22] A negative selection algorithm based on hierarchical clustering of self set
    CHEN Wen
    LI Tao
    LIU XiaoJie
    ZHANG Bing
    ScienceChina(InformationSciences), 2013, 56 (08) : 203 - 215
  • [23] A negative selection algorithm based on hierarchical clustering of self set
    Wen Chen
    Tao Li
    XiaoJie Liu
    Bing Zhang
    Science China Information Sciences, 2013, 56 : 1 - 13
  • [24] A negative selection algorithm based on hierarchical clustering of self set
    Chen Wen
    Li Tao
    Liu XiaoJie
    Zhang Bing
    SCIENCE CHINA-INFORMATION SCIENCES, 2013, 56 (08) : 1 - 13
  • [25] An Improved Method for Multi-objective Clustering Ensemble Algorithm
    Liu, Ruochen
    Liu, Yong
    Li, Yangyang
    2012 IEEE CONGRESS ON EVOLUTIONARY COMPUTATION (CEC), 2012,
  • [26] Towards Efficient Ensemble Hierarchical Clustering with MapReduce-based Clusters Clustering Technique and the Innovative Similarity Criterion
    Ping Tian
    Huitao Shen
    Ahad Abolfathi
    Journal of Grid Computing, 2022, 20
  • [27] Towards Efficient Ensemble Hierarchical Clustering with MapReduce-based Clusters Clustering Technique and the Innovative Similarity Criterion
    Tian, Ping
    Shen, Huitao
    Abolfathi, Ahad
    JOURNAL OF GRID COMPUTING, 2022, 20 (04)
  • [28] Cluster ensemble selection and consensus clustering: A multi-objective optimization approach
    Aktas, Dilay
    Lokman, Banu
    Inkaya, Tulin
    Dejaegere, Gilles
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2024, 314 (03) : 1065 - 1077
  • [29] DICLENS: Divisive Clustering Ensemble with Automatic Cluster Number
    Mimaroglu, Selim
    Aksehirli, Emin
    IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2012, 9 (02) : 408 - 420
  • [30] Cluster's Quality Evaluation and Selective Clustering Ensemble
    Li, Feijiang
    Qian, Yuhua
    Wang, Jieting
    Dang, Chuangyin
    Liu, Bing
    ACM TRANSACTIONS ON KNOWLEDGE DISCOVERY FROM DATA, 2018, 12 (05)