A novel member enhancement-based clustering ensemble algorithm

Cited by: 0
Authors
He, Yulin [1 ,2 ,3 ]
Yang, Jin [2 ]
Cheng, Yingchao [1 ]
Du, Xueqin [2 ]
Huang, Joshua Zhexue [1 ,2 ]
Affiliations
[1] Guangdong Lab Artificial Intelligence & Digital Ec, Shenzhen, Peoples R China
[2] Shenzhen Univ, Coll Comp Sci & Software Engn, Shenzhen, Peoples R China
[3] Guangdong Lab Artificial Intelligence & Digital Ec, Shenzhen 518107, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
ensemble clustering; heterocluster; homocluster; MMD; neighborhood density; COMBINING MULTIPLE CLUSTERINGS; SELECTION; PARTITIONS; STABILITY; QUALITY;
DOI
10.1002/cpe.7992
Chinese Library Classification number
TP31 [Computer Software];
Discipline classification code
081202 ; 0835 ;
Abstract
Clustering ensemble is a popular approach for identifying data clusters: it combines the clustering results from multiple base clustering algorithms to produce more accurate and robust data clusters. However, the performance of clustering ensemble algorithms is highly dependent on the quality of the clustering members. To address this problem, this paper proposes a member enhancement-based clustering ensemble (MECE) algorithm that selects the ensemble members by considering their distribution consistency. MECE has two main components, called heterocluster splitting and homocluster merging. The first component estimates two probability density functions (p.d.f.s) on the sample points of a heterocluster and represents them using a Gaussian distribution and a Gaussian mixture model. If the random numbers generated by these two p.d.f.s have different probability distributions, the heterocluster is split into smaller clusters. The second component merges the clusters that have high neighborhood densities into a homocluster, where the neighborhood density is measured using a novel evaluation criterion. In addition, a co-association matrix is presented, which serves as a summary for the ensemble of diverse clusters. A series of experiments were conducted to evaluate the feasibility and effectiveness of the proposed ensemble member generation algorithm. Results show that the proposed MECE algorithm can select high-quality ensemble members and, as a result, yield better clusterings than six state-of-the-art ensemble clustering algorithms, that is, cluster-based similarity partitioning algorithm (CSPA), meta-clustering algorithm (MCLA), hybrid bipartite graph formulation (HBGF), evidence accumulation clustering (EAC), locally weighted evidence accumulation (LWEA), and locally weighted graph partition (LWGP). Specifically, the MECE algorithm achieves nearly 23% higher average NMI, 27% higher average ARI, 15% higher average FMI, and 10% higher average purity than the CSPA, MCLA, HBGF, EAC, LWEA, and LWGP algorithms. The experimental results demonstrate that the MECE algorithm is a valid approach to clustering ensemble problems.
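The abstract mentions a co-association matrix as a summary of the ensemble but does not give its definition. As a rough illustration only, the sketch below builds the standard evidence-accumulation form of such a matrix, in which each entry is the fraction of base clusterings that place two points in the same cluster. This is an assumption about the general technique, not the MECE construction itself; the function name `co_association` and the toy labelings are hypothetical.

```python
import numpy as np

def co_association(labelings):
    """Return the fraction of base clusterings that co-cluster each pair of points.

    Standard evidence-accumulation form; assumed here for illustration,
    not taken from the MECE paper.
    """
    labelings = np.asarray(labelings)            # shape: (n_members, n_points)
    n_members, n_points = labelings.shape
    matrix = np.zeros((n_points, n_points))
    for labels in labelings:
        # 1 where two points share a cluster label in this member, else 0
        matrix += labels[:, None] == labels[None, :]
    return matrix / n_members

# Three hypothetical base clusterings of four points.
# Label values are arbitrary: members 1 and 2 describe the same partition.
members = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
]
C = co_association(members)
print(C)
```

Because only label equality within each member is compared, the matrix is unaffected by how each base clustering happens to number its clusters. A consensus partition is then typically obtained by clustering the rows of this matrix, for example with hierarchical clustering.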
Pages: 23
Related papers
50 in total
  • [31] A Novel Clustering Algorithm Based on Information Geometry for Cooperative Spectrum Sensing
    Zhang, Shunchao
    Wang, Yonghua
    Zhang, Yongwei
    Wan, Pin
    Zhuang, Jiawei
    IEEE SYSTEMS JOURNAL, 2021, 15 (02) : 3121 - 3130
  • [32] A Novel Multiple Classifier Generation and Combination Framework Based on Fuzzy Clustering and Individualized Ensemble Construction
    Gao, Zhen
    Zand, Maryam
    Ruan, Jianhua
    2019 IEEE INTERNATIONAL CONFERENCE ON DATA SCIENCE AND ADVANCED ANALYTICS (DSAA 2019), 2019, : 231 - 240
  • [33] A novel auto-pruned ensemble clustering via SOCP
    Ucuncu, Duygu
    Akyuz, Sureyya
    Gul, Erdal
    CENTRAL EUROPEAN JOURNAL OF OPERATIONS RESEARCH, 2024, 32 (03) : 819 - 841
  • [34] A Novel Hybrid Ensemble Clustering Technique for Student Performance Prediction
    Fida, Sanam
    Masood, Nayyer
    Tariq, Nirmal
    Qayyum, Faiza
    JOURNAL OF UNIVERSAL COMPUTER SCIENCE, 2022, 28 (08) : 777 - 798
  • [35] Ensemble Clustering via Co-Association Matrix Self-Enhancement
    Jia, Yuheng
    Tao, Sirui
    Wang, Ran
    Wang, Yongheng
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (08) : 11168 - 11179
  • [36] AdaUK-Means: An Ensemble Boosting Clustering Algorithm on Uncertain Objects
    Xu, Lei
    Hu, Qinghua
    Zhang, Xisheng
    Chen, Yanshuo
    Liao, Changrui
    PATTERN RECOGNITION (CCPR 2016), PT I, 2016, 662 : 27 - 41
  • [37] The influences of PLA into PMMA on crystallinity and thermal properties enhancement-based hybrid polymer in gel properties
    Mazuki, N. F.
    Nagao, Y.
    Kufian, M. Z.
    Samsudin, A. S.
    MATERIALS TODAY-PROCEEDINGS, 2022, 49 : 3105 - 3111
  • [38] Dual-level clustering ensemble algorithm with three consensus strategies
    Shan, Yunxiao
    Li, Shu
    Li, Fuxiang
    Cui, Yuxin
    Chen, Minghua
    SCIENTIFIC REPORTS, 2023, 13 (01)
  • [39] An improved weighted ensemble clustering based on two-tier uncertainty measurement
    Gu, Qinghua
    Wang, Yan
    Wang, Peipei
    Li, Xuexian
    Chen, Lu
    Xiong, Neal N.
    Liu, Di
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 238
  • [40] INCORPORATING STABILITY AND ERROR-BASED CONSTRAINTS FOR A NOVEL PARTITIONAL CLUSTERING ALGORITHM
    Aparna, K.
    Nair, Mydhili K.
    INTERNATIONAL JOURNAL OF TECHNOLOGY, 2016, 7 (04) : 691 - 700