An ensemble hierarchical clustering algorithm based on merits at cluster and partition levels

Cited by: 20
Authors
Huang, Qirui [1]
Gao, Rui [2]
Akhavan, Hoda [3]
Affiliations
[1] Nanyang Inst Technol, Sch Informat Engn, Nanyang 473004, Henan, Peoples R China
[2] Dongying Vocat Inst, Acad Affairs Off, Dongying 257000, Shandong, Peoples R China
[3] Amirkabir Univ Technol, Comp Engn & Informat Technol Dept, Tehran, Iran
Keywords
Ensemble clustering; Cluster consensus; Hyper-cluster; Merit level; Robustness measure; QUALITY; PREDICTION; DIVERSITY; CRITERION; SELECTION;
DOI
10.1016/j.patcog.2022.109255
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Ensemble clustering has emerged as a combination of several basic clustering algorithms to achieve high quality final clustering. However, this technique is challenging due to the complexities in primary clusters such as overlapping, vagueness, instability and uncertainty. Typically, ensemble clustering uses all the primary clusters into partitions for consensus, where the merits of a cluster or a partition can be considered to improve the quality of the consensus. In general, the robustness of a partition may be poorly measured, while having some high-quality clusters. Inspired by the evaluation of cluster and partition, this paper proposes an ensemble hierarchical clustering algorithm based on the cluster consensus selection approach. Here, the selection of a subset of primary clusters from partitions based on their merit level is emphasized. Merit level is defined using the development of Normalized Mutual Information measure. Clusters of basic clustering algorithms that satisfy the predefined threshold of this measure are selected to participate in the final consensus. In addition, the consensus of the selected primary clusters to create the final clusters is performed based on the clusters clustering technique. In this technique, the selected primary clusters are re-clustered to create hyper-clusters. Finally, the final clusters are formed by assigning instances to hyper-clusters with the highest similarity. Here, an innovative criterion based on merit and cluster size for defining similarity is presented. The performance of the proposed algorithm has been proven by extensive experiments on real-world datasets from the UCI repository compared to state-of-the-art algorithms such as CPDM, ENMI, IDEA, CFTLC and SSCEN. (c) 2022 Elsevier Ltd. All rights reserved.
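Illustrative sketch (not the authors' implementation): the pipeline outlined in the abstract (score each primary cluster, keep those above a threshold, re-cluster the kept clusters into hyper-clusters, assign each instance to its most similar hyper-cluster) might look roughly like the Python below. The abstract does not give the formulas, so everything specific here is an assumption: the merit is taken as the mean NMI between a cluster's in/out indicator and the other partitions, the re-clustering uses average-linkage merging on Jaccard similarity, and the similarity weight is merit times cluster size. All function names, the toy partitions and the threshold 0.3 are hypothetical.

```python
from collections import Counter
from math import log

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())

def nmi(a, b):
    """Normalized mutual information, 2*I(a;b) / (H(a) + H(b))."""
    n, ha, hb = len(a), entropy(a), entropy(b)
    if ha + hb == 0:
        return 0.0
    ca, cb, joint = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum((c / n) * log(c * n / (ca[x] * cb[y])) for (x, y), c in joint.items())
    return 2 * mi / (ha + hb)

def cluster_merit(members, own_index, partitions):
    """ASSUMED merit: mean NMI of the cluster's in/out indicator vs. the other partitions."""
    n = len(partitions[0])
    indicator = [int(i in members) for i in range(n)]
    others = [p for j, p in enumerate(partitions) if j != own_index]
    return sum(nmi(indicator, p) for p in others) / len(others)

def select_clusters(partitions, threshold):
    """Keep (members, merit) for every primary cluster whose merit meets the threshold."""
    selected = []
    for j, part in enumerate(partitions):
        for label in sorted(set(part)):
            members = frozenset(i for i, l in enumerate(part) if l == label)
            merit = cluster_merit(members, j, partitions)
            if merit >= threshold:
                selected.append((members, merit))
    return selected

def jaccard(a, b):
    return len(a & b) / len(a | b)

def hyper_clusters(selected, k):
    """ASSUMED re-clustering: average-linkage agglomeration on Jaccard similarity."""
    groups = [[c] for c in selected]

    def linkage(g, h):
        return sum(jaccard(a, b) for a, _ in g for b, _ in h) / (len(g) * len(h))

    while len(groups) > k:
        pairs = ((i, j) for i in range(len(groups)) for j in range(i + 1, len(groups)))
        i, j = max(pairs, key=lambda ij: linkage(groups[ij[0]], groups[ij[1]]))
        groups[i] += groups.pop(j)  # j > i, so index i stays valid after the pop
    return groups

def assign(n, groups):
    """ASSUMED similarity: share of a hyper-cluster's (merit * size) weight covering the instance."""
    labels = []
    for i in range(n):
        scores = []
        for g in groups:
            total = sum(m * len(c) for c, m in g)
            hit = sum(m * len(c) for c, m in g if i in c)
            scores.append(hit / total if total else 0.0)
        labels.append(max(range(len(groups)), key=scores.__getitem__))
    return labels

# Toy ensemble: three roughly agreeing partitions of 6 instances plus one noisy partition.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 1, 0, 1, 0, 1],
]
selected = select_clusters(partitions, threshold=0.3)
groups = hyper_clusters(selected, k=2)
labels = assign(len(partitions[0]), groups)
print(len(selected), labels)
```

In this toy run the two clusters of the noisy fourth partition fall below the threshold and take no part in the consensus, which is the selection effect the abstract describes. A real experiment would need the paper's actual merit and similarity definitions.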
Pages: 17
Related papers
50 in total
  • [1] An Ensemble Clustering Framework Based on Hierarchical Clustering Ensemble Selection and Clusters Clustering
    Li, Wenjun
    Wang, Zikang
    Sun, Wei
    Bahrami, Sara
    CYBERNETICS AND SYSTEMS, 2023, 54 (05) : 741 - 766
  • [2] A novel member enhancement-based clustering ensemble algorithm
    He, Yulin
    Yang, Jin
    Cheng, Yingchao
    Du, Xueqin
    Huang, Joshua Zhexue
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2024, 36 (10)
  • [3] An ensemble agglomerative hierarchical clustering algorithm based on clusters clustering technique and the novel similarity measurement
    Li, Teng
    Rezaeipanah, Amin
    El Din, ElSayed M. Tag
    JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES, 2022, 34 (06) : 3828 - 3842
  • [4] Dependability-based cluster weighting in clustering ensemble
    Najafi, Fatemeh
    Parvin, Hamid
    Mirzaie, Kamal
    Nejatian, Samad
    Rezaie, Vahideh
    STATISTICAL ANALYSIS AND DATA MINING, 2020, 13 (02) : 151 - 164
  • [5] Hierarchical cluster ensemble selection
    Akbari, Ebrahim
    Dahlan, Halina Mohamed
    Ibrahim, Roliana
    Alizadeh, Hosein
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2015, 39 : 146 - 156
  • [6] A Clustering Ensemble Method Based on Cluster Selection and Cluster Splitting
    Tang, Yuyang
    Liu, Xiabi
    PROCEEDINGS OF 2018 10TH INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND COMPUTING (ICMLC 2018), 2018, : 54 - 58
  • [7] A hierarchical fuzzy cluster ensemble approach and its application to big data clustering
    Su, Pan
    Shang, Changjing
    Shen, Qiang
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2015, 28 (06) : 2409 - 2421
  • [8] SEP/COP: An efficient method to find the best partition in hierarchical clustering based on a new cluster validity index
    Gurrutxaga, Ibai
    Albisua, Inaki
    Arbelaitz, Olatz
    Martin, Jose I.
    Muguerza, Javier
    Perez, Jesus M.
    Perona, Inigo
    PATTERN RECOGNITION, 2010, 43 (10) : 3364 - 3373
  • [9] Tumor Clustering based on Hybrid Cluster Ensemble Framework
    Yu, Zhiwen
    You, Jane
    Chen, Hantao
    Li, Le
    Wang, Xiaowei
    2012 INTERNATIONAL CONFERENCE ON COMPUTERIZED HEALTHCARE (ICCH), 2012, : 99 - +
  • [10] BINARIZATION OF CONSENSUS PARTITION MATRIX FOR ENSEMBLE CLUSTERING
    Abu-Jamous, Basel
    Fa, Rui
    Nandi, Asoke K.
    Roberts, David J.
    2012 PROCEEDINGS OF THE 20TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2012, : 2193 - 2197