The significance of Kappa and F-score in clustering ensemble: a comprehensive analysis

Cited by: 0
Authors
Jie Yan [1]
Xin Liu [2]
Ji Qi [1]
Tao You [3]
Zhong-Yuan Zhang [1]
Affiliations
[1] Central University of Finance and Economics, School of Statistics and Mathematics
[2] Intelligent Science & Technology Academy Limited of CASIC
[3] China Southern Asset Management Co., Ltd
Keywords
Clustering ensemble; Diversity; Stability; Kappa; F-score
DOI
10.1007/s10115-025-02388-4
Abstract
Clustering ensemble techniques have gained significant attention due to their ability to enhance the accuracy and robustness of partition results. Selective clustering ensemble (SCE) and weighted clustering ensemble (WCE) methods further improve performance by selecting and weighting base partitions or clusters based on their diversity and stability. However, striking a balance between these two factors remains challenging. The primary difficulty lies in evaluating the quality of base partitions and clusters. Existing evaluation criteria, such as normalized mutual information (NMI) and its variants, suffer from inherent flaws, including the symmetric problem, the context meaning problem, and disregard for the importance of small clusters. To address these limitations, this paper proposes a novel evaluation method that utilizes kappa and F-score. We introduce a new SCE method that employs kappa to select informative base partitions and utilizes F-score to assign weights to clusters based on their stability. Empirical validation on real datasets demonstrates the effectiveness and efficiency of the proposed approach. The code is available at https://github.com/Jarvisyan/DSKF-matlab.
Pages: 5377-5412
Page count: 35