A Novel Cluster Validity Index Based on Local Cores

Cited by: 91
Authors
Cheng, Dongdong [1]
Zhu, Qingsheng [1]
Huang, Jinlong [2]
Wu, Quanwang [1]
Yang, Lijun [3]
Affiliations
[1] Chongqing Univ, Dept Comp Sci, Chongqing 400044, Peoples R China
[2] Yangtze Normal Univ, Coll Comp Engn, Chongqing 408000, Peoples R China
[3] Southwest Minzu Univ, Sch Comp Sci & Technol, Chengdu 610041, Sichuan, Peoples R China
Keywords
Clustering analysis; clustering validity index; hierarchical clustering; local cores; OPTIMAL NUMBER; ALGORITHMS; VALIDATION; SEARCH; FIND;
DOI
10.1109/TNNLS.2018.2853710
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104; 0812; 0835; 1405;
Abstract
Evaluating the quality of clusters is critical in most cluster analyses. A number of cluster validity indexes have been proposed, such as the Silhouette and Davies-Bouldin indexes. However, these validity indexes cannot be used to process clusters with arbitrary shapes. Some researchers employ graph-based distances to cluster nonspherical data sets, but computing graph-based distances between all pairs of points in a data set is time-consuming. A potential solution is to select some representative points. Inspired by this idea, we propose a novel Local Cores-based Cluster Validity (LCCV) index to improve the performance of the Silhouette index. Local cores, which are points with local maximum density, are selected as representative points. Since graph-based distance is used to evaluate the dissimilarity between local cores, the LCCV index is effective for obtaining the optimal cluster number for data sets containing clusters with arbitrary shapes. Moreover, a hierarchical clustering algorithm based on the LCCV index is proposed. The experimental results on synthetic and real data sets indicate that the new index outperforms existing ones.
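The general idea in the abstract (a Silhouette-style score computed only over a few representative points, with graph-based rather than Euclidean distances) can be illustrated with a minimal sketch. This is not the LCCV definition from the paper: the function names, the k-nearest-neighbor graph, and the hand-picked representatives are all assumptions made for illustration, and the paper's density-based selection of local cores is not reproduced here.

```python
import heapq
import math


def knn_graph(points, k):
    """Build a symmetric k-nearest-neighbor graph with Euclidean edge weights."""
    n = len(points)
    adj = {i: {} for i in range(n)}
    for i in range(n):
        nearest = sorted(
            (math.dist(points[i], points[j]), j) for j in range(n) if j != i
        )[:k]
        for d, j in nearest:
            adj[i][j] = d
            adj[j][i] = d
    return adj


def graph_distances(adj, source):
    """Shortest-path (graph-based) distances from source, via Dijkstra."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in adj[u].items():
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def silhouette_on_representatives(points, labels, reps, k=4):
    """Mean Silhouette over the representatives only (hypothetical helper).

    Assumption: `reps` are indices of representative points chosen by the
    caller; the paper chooses local cores by local maximum density instead.
    """
    adj = knn_graph(points, k)
    dist = {r: graph_distances(adj, r) for r in reps}
    scores = []
    for r in reps:
        by_cluster = {}
        for s in reps:
            if s != r:
                d = dist[r].get(s, math.inf)  # inf if unreachable in the graph
                by_cluster.setdefault(labels[s], []).append(d)
        own = by_cluster.pop(labels[r], None)
        if not own or not by_cluster:
            scores.append(0.0)  # singleton cluster or only one cluster
            continue
        a = sum(own) / len(own)  # mean distance within own cluster
        b = min(sum(v) / len(v) for v in by_cluster.values())  # nearest other
        if math.isinf(max(a, b)):
            scores.append(1.0 if a < b else (-1.0 if b < a else 0.0))
        else:
            scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)
```

Only one shortest-path search is run per representative, so the cost grows with the number of representatives rather than with all pairs of points, which is the saving the abstract refers to. Comparing the score across candidate partitions (for example, partitions with different numbers of clusters) and keeping the highest one mirrors how a validity index is typically used.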
Pages: 985-999
Page count: 15