A comprehensive study of clustering ensemble weighting based on cluster quality and diversity

Cited by: 60
Authors
Nazari, Ahmad [1]
Dehghan, Ayob [1]
Nejatian, Samad [2, 3]
Rezaie, Vahideh [3, 4]
Parvin, Hamid [1, 5]
Affiliations
[1] Islamic Azad Univ, Yasooj Branch, Dept Comp Engn, Yasuj, Iran
[2] Islamic Azad Univ, Yasooj Branch, Dept Elect Engn, Yasuj, Iran
[3] Islamic Azad Univ, Yasooj Branch, Young Researchers & Elite Club, Yasuj, Iran
[4] Islamic Azad Univ, Yasooj Branch, Dept Math, Yasuj, Iran
[5] Islamic Azad Univ, Young Researchers & Elite Club, Nourabad Mamasani Branch, Nourabad, Mamasani, Iran
Keywords
Data clustering; Clustering ensemble; Consensus function; Weighting; COMBINING MULTIPLE CLUSTERINGS; TRANSFER DISTANCE; SELECTION; CONSENSUS; PARTITIONS
DOI
10.1007/s10044-017-0676-x
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering, a major task in data mining, aims to discover hidden patterns in unlabeled datasets. Finding the best clustering is considered one of the most challenging problems in data mining. Because of the complexity of the problem and the weaknesses of individual clustering algorithms, much research has been directed toward ensemble clustering methods. Ensemble clustering aggregates a pool of base clusterings and produces an output clustering, also called the consensus clustering. The consensus clustering is usually better than the outputs of the base clustering algorithms. However, low-quality base clusterings weaken their consensus clustering. Although some research has addressed selecting a subset of high-quality base clusterings according to a clustering assessment metric, cluster-level selection has been ignored. This paper proposes a new clustering ensemble framework based on cluster-level weighting. The degree of certainty that the given ensemble has about a cluster is taken as the reliability of that cluster, and it is computed from the accretion amount of that cluster by the ensemble. The final ensemble is then created by selecting the best clusters and assigning each selected cluster a weight based on its reliability. The paper then proposes a cluster-level weighted co-association matrix in place of the traditional co-association matrix. Finally, two consensus functions are introduced and used to produce the consensus partition. In the experiments, the proposed framework outperforms state-of-the-art clustering ensemble methods.
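The idea of a cluster-level weighted co-association matrix can be illustrated with a minimal Python sketch. This is not the paper's implementation: the abstract does not define how the reliability ("accretion amount") of a cluster is computed, so the sketch substitutes a simple stand-in, namely the mean best Jaccard overlap of a cluster with the clusters of the other base clusterings. The function names `cluster_agreement_weights` and `weighted_co_association` are hypothetical and chosen only for this illustration.

```python
import numpy as np


def cluster_agreement_weights(labelings):
    """Assign each cluster a weight in [0, 1].

    ASSUMPTION: the weight is the mean, over the other base clusterings,
    of the best Jaccard overlap between this cluster and any of their
    clusters. This is a stand-in, not the paper's reliability measure.
    """
    labelings = [np.asarray(labels) for labels in labelings]
    weights = {}
    for i, labels_i in enumerate(labelings):
        for c in np.unique(labels_i):
            members = labels_i == c
            scores = []
            for j, labels_j in enumerate(labelings):
                if j == i:
                    continue
                best = 0.0
                for d in np.unique(labels_j):
                    other = labels_j == d
                    inter = np.logical_and(members, other).sum()
                    union = np.logical_or(members, other).sum()
                    best = max(best, inter / union)
                scores.append(best)
            # With a single base clustering there is nothing to compare against.
            weights[(i, int(c))] = float(np.mean(scores)) if scores else 1.0
    return weights


def weighted_co_association(labelings, weights):
    """Build an n x n matrix in which each co-occurrence of two points in a
    cluster contributes that cluster's weight instead of a constant 1."""
    labelings = [np.asarray(labels) for labels in labelings]
    n = labelings[0].shape[0]
    matrix = np.zeros((n, n))
    for i, labels_i in enumerate(labelings):
        for c in np.unique(labels_i):
            members = (labels_i == c).astype(float)
            matrix += weights[(i, int(c))] * np.outer(members, members)
    return matrix / len(labelings)
```

Calling `weighted_co_association(labelings, cluster_agreement_weights(labelings))` on a list of label vectors gives a symmetric similarity matrix. Setting every weight to 1 reduces it to the traditional co-association matrix, and setting a weight to 0 corresponds to dropping that cluster from the ensemble.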
Pages: 133-145
Page count: 13
Related papers
50 in total
  • [41] A hierarchical fuzzy cluster ensemble approach and its application to big data clustering
    Su, Pan
    Shang, Changjing
    Shen, Qiang
    JOURNAL OF INTELLIGENT & FUZZY SYSTEMS, 2015, 28 (06) : 2409 - 2421
  • [42] Co-Clustering Ensemble Based on Bilateral K-Means Algorithm
    Yang, Hui
    Peng, Han
    Zhu, Jianyong
    Nie, Feiping
    IEEE ACCESS, 2020, 8 : 51285 - 51294
  • [43] Flexible fuzzy co-clustering with feature-cluster weighting
    Tjhi, William-Chandra
    Chen, Lihui
    2006 9TH INTERNATIONAL CONFERENCE ON CONTROL, AUTOMATION, ROBOTICS AND VISION, VOLS 1-5, 2006 : 1547 - +
  • [44] A kernel-induced weighted object-cluster association-based ensemble method for educational data clustering
    Chau Thi Ngoc Vo
    Phung Hua Nguyen
    JOURNAL OF INFORMATION AND TELECOMMUNICATION, 2020, 4 (02) : 119 - 139
  • [45] An improved clustering ensemble method based link analysis
    Hao, Zhi-Feng
    Wang, Li-Juan
    Cai, Rui-Chu
    Wen, Wen
    WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS, 2015, 18 (02) : 185 - 195
  • [46] Weighted Delta Factor Cluster Ensemble Algorithm for Categorical Data Clustering in Data Mining
    Sengottaian, Sarumathi
    Natesan, Shanthi
    Mathivanan, Sharmila
    INTERNATIONAL ARAB JOURNAL OF INFORMATION TECHNOLOGY, 2017, 14 (03) : 275 - 284
  • [47] A Method of Clustering Ensemble Based on Grey Relation Analysis
    Shi, Tuo
    Jiang, Wei
    Luo, Ping
    WIRELESS PERSONAL COMMUNICATIONS, 2018, 103 (01) : 871 - 885
  • [48] A Method of Clustering Ensemble Based on Grey Relation Analysis
    Tuo Shi
    Wei Jiang
    Ping Luo
    Wireless Personal Communications, 2018, 103 : 871 - 885
  • [49] A Point-Cluster-Partition Architecture for Weighted Clustering Ensemble
    Li, Na
    Xu, Sen
    Xu, Heyang
    Xu, Xiufang
    Guo, Naixuan
    Cai, Na
    NEURAL PROCESSING LETTERS, 2024, 56 (03)
  • [50] Clustering ensemble based on density peaks
    Chu R.-H.
    Wang H.-J.
    Yang Y.
    Li T.-R.
    Wang, Hong-Jun (wanghongjun@swjtu.edu.cn), 1600, Science Press (42) : 1401 - 1412