An ensemble hierarchical clustering algorithm based on merits at cluster and partition levels

Cited by: 20
Authors
Huang, Qirui [1]
Gao, Rui [2]
Akhavan, Hoda [3]
Affiliations
[1] Nanyang Inst Technol, Sch Informat Engn, Nanyang 473004, Henan, Peoples R China
[2] Dongying Vocat Inst, Acad Affairs Off, Dongying 257000, Shandong, Peoples R China
[3] Amirkabir Univ Technol, Comp Engn & Informat Technol Dept, Tehran, Iran
Keywords
Ensemble clustering; Cluster consensus; Hyper-cluster; Merit level; Robustness measure; QUALITY; PREDICTION; DIVERSITY; CRITERION; SELECTION
DOI
10.1016/j.patcog.2022.109255
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Ensemble clustering has emerged as a combination of several basic clustering algorithms to achieve high quality final clustering. However, this technique is challenging due to the complexities in primary clusters such as overlapping, vagueness, instability and uncertainty. Typically, ensemble clustering uses all the primary clusters into partitions for consensus, where the merits of a cluster or a partition can be considered to improve the quality of the consensus. In general, the robustness of a partition may be poorly measured, while having some high-quality clusters. Inspired by the evaluation of cluster and partition, this paper proposes an ensemble hierarchical clustering algorithm based on the cluster consensus selection approach. Here, the selection of a subset of primary clusters from partitions based on their merit level is emphasized. Merit level is defined using the development of Normalized Mutual Information measure. Clusters of basic clustering algorithms that satisfy the predefined threshold of this measure are selected to participate in the final consensus. In addition, the consensus of the selected primary clusters to create the final clusters is performed based on the clusters clustering technique. In this technique, the selected primary clusters are re-clustered to create hyper-clusters. Finally, the final clusters are formed by assigning instances to hyper-clusters with the highest similarity. Here, an innovative criterion based on merit and cluster size for defining similarity is presented. The performance of the proposed algorithm has been proven by extensive experiments on real-world datasets from the UCI repository compared to state-of-the-art algorithms such as CPDM, ENMI, IDEA, CFTLC and SSCEN. (c) 2022 Elsevier Ltd. All rights reserved.
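The pipeline outlined in the abstract (score each primary cluster, keep those above a threshold, re-cluster the survivors into hyper-clusters, then assign instances) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not give the formulas, so the merit definition (mean NMI between a cluster's in/out indicator and the other partitions), the Jaccard average-linkage merging, and the merit-times-size similarity weight are all assumptions made for illustration, as are the function names and the `threshold` and `k` parameters.

```python
import math
from collections import Counter

def nmi(a, b):
    """Normalized mutual information of two label sequences (sqrt normalization)."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))

    def entropy(counts):
        return -sum(v / n * math.log(v / n) for v in counts.values())

    mi = sum(v / n * math.log(v * n / (pa[x] * pb[y])) for (x, y), v in pab.items())
    ha, hb = entropy(pa), entropy(pb)
    return mi / math.sqrt(ha * hb) if ha > 0 and hb > 0 else 0.0

def cluster_merit(members, partitions, own_index):
    """ASSUMED merit: mean NMI between the cluster's in/out indicator and every
    other partition. Requires at least two partitions."""
    n = len(partitions[0])
    indicator = [i in members for i in range(n)]
    others = [p for j, p in enumerate(partitions) if j != own_index]
    return sum(nmi(indicator, p) for p in others) / len(others)

def avg_link(g1, g2):
    """Average Jaccard similarity between two groups of (members, merit) pairs."""
    total = sum(len(a & b) / len(a | b) for a, _ in g1 for b, _ in g2)
    return total / (len(g1) * len(g2))

def consensus(partitions, threshold=0.5, k=2):
    n = len(partitions[0])
    # Step 1: keep only primary clusters whose merit reaches the threshold.
    selected = []
    for j, p in enumerate(partitions):
        for label in sorted(set(p)):
            members = frozenset(i for i in range(n) if p[i] == label)
            merit = cluster_merit(members, partitions, j)
            if merit >= threshold:
                selected.append((members, merit))
    if not selected:
        raise ValueError("no primary cluster reached the merit threshold")
    # Step 2: greedily merge the selected clusters into k hyper-clusters
    # (ASSUMED: average linkage on Jaccard similarity).
    groups = [[c] for c in selected]
    while len(groups) > k:
        pairs = ((x, y) for x in range(len(groups)) for y in range(x + 1, len(groups)))
        x, y = max(pairs, key=lambda xy: avg_link(groups[xy[0]], groups[xy[1]]))
        groups[x] += groups.pop(y)
    # Step 3: assign each instance to the most similar hyper-cluster
    # (ASSUMED similarity: merit * size weight of the member clusters that
    # contain the instance, normalized by the hyper-cluster's total weight).
    labels = []
    for i in range(n):
        scores = []
        for g in groups:
            total = sum(m * len(c) for c, m in g)
            hit = sum(m * len(c) for c, m in g if i in c)
            scores.append(hit / total if total > 0 else 0.0)
        # Ties (including all-zero scores) fall to the lowest group index.
        labels.append(scores.index(max(scores)))
    return labels
```

For example, calling `consensus` on four partitions of six instances, three of which agree on the split {0, 1, 2} versus {3, 4, 5}, discards the two clusters of the dissenting partition under this assumed merit and recovers the majority split.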
Pages: 17
Related papers
50 in total
  • [31] Adaptive Fuzzy Exponent Cluster Ensemble System Based Feature Selection and Spectral Clustering
    Ben Ayed, Abdelkarim
    Ben Halima, Mohamed
    Alimi, Adel M.
    2017 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE), 2017,
  • [32] HCFS: A Density Peak Based Clustering Algorithm Employing A Hierarchical Strategy
    Zhuo, Linlin
    Li, Kenli
    Liao, Bo
    Li, Hao
    Wei, Xiaohui
    Li, Keqin
    IEEE ACCESS, 2019, 7 : 74612 - 74624
  • [33] On an ensemble algorithm for clustering cancer patient data
    Qi, Ran
    Wu, Dengyuan
    Sheng, Li
    Henson, Donald
    Schwartz, Arnold
    Xu, Eric
    Xing, Kai
    Chen, Dechang
    BMC SYSTEMS BIOLOGY, 2013, 7
  • [34] LWMC: A Locally Weighted Meta-Clustering Algorithm for Ensemble Clustering
    Huang, Dong
    Wang, Chang-Dong
    Lai, Jian-Huang
    NEURAL INFORMATION PROCESSING, ICONIP 2017, PT V, 2017, 10638 : 167 - 176
  • [35] An Algorithm for Automatic Recognition of Cluster Centers Based on Local Density Clustering
    Ye Xuanzuo
    Li Dinghao
    He Xiongxiong
    2017 29TH CHINESE CONTROL AND DECISION CONFERENCE (CCDC), 2017, : 1347 - 1351
  • [36] Semi-supervised hierarchical ensemble clustering based on an innovative distance metric and constraint information
    Shen, Baohua
    Jiang, Juan
    Qian, Feng
    Li, Daoguo
    Ye, Yanming
    Ahmadi, Gholamreza
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 124
  • [37] Utilizing Cluster Quality in Hierarchical Clustering for Analogy-Based Software Effort Estimation
    Wu, Jack H. C.
    Keung, Jacky W.
    PROCEEDINGS OF 2017 8TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE (ICSESS 2017), 2017, : 1 - 4
  • [38] Clustering ensemble selection based on the extended Jaccard measure
    Khalili, Hajar
    Rabbani, Mohsen
    Akbari, Ebrahim
    TURKISH JOURNAL OF ELECTRICAL ENGINEERING AND COMPUTER SCIENCES, 2021, 29 (04) : 2215 - 2231
  • [39] Security Clustering Algorithm Based on Reputation in Hierarchical Peer-to-Peer Network
    Chen, Mei
    Luo, Xin
    Wu, Guowen
    Tan, Yang
    Kita, Kenji
    INTERNATIONAL CONFERENCE ON GRAPHIC AND IMAGE PROCESSING (ICGIP 2012), 2013, 8768
  • [40] RSPCA: Random Sample Partition and Clustering Approximation for ensemble learning of big data
    Mahmud, Mohammad Sultan
    Zheng, Hua
    Garcia-Gil, Diego
    Garcia, Salvador
    Huang, Joshua Zhexue
    PATTERN RECOGNITION, 2025, 161