A Clustering Validity Index Based on Pairing Frequency

Cited by: 5
Authors
Cui, Hongyan [1, 2, 3, 5]
Zhang, Kuo [1, 2, 3]
Fang, Yajun [7]
Sobolevsky, Stanislav [4, 5, 6]
Ratti, Carlo [5]
Horn, Berthold K. P. [7]
Affiliations
[1] Beijing Univ Posts & Telecommun, State Key Lab Networking & Switching Technol, Beijing 100876, Peoples R China
[2] Beijing Lab Adv Informat Networks, Beijing 100876, Peoples R China
[3] Key Lab Network Syst Architecture & Convergence, Beijing 100876, Peoples R China
[4] NYU, Ctr Urban Sci & Progress, Brooklyn, NY 10003 USA
[5] MIT, Senseable City Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[6] Natl Res Univ ITMO, Inst Design & Urban Studies, St Petersburg 197101, Russia
[7] MIT, CSAIL Lab, 77 Massachusetts Ave, Cambridge, MA 02139 USA
Source
IEEE ACCESS | 2017 / Vol. 5
Funding
National Natural Science Foundation of China
Keywords
Pairwise pattern; clustering validity; clustering analysis; fuzzy c-means; FUZZY;
DOI
10.1109/ACCESS.2017.2743985
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Clustering is an important problem with applications in many research areas. However, there is a large variety of clustering algorithms, and results can differ considerably depending on the choice of algorithm and input parameters, so it is important to evaluate clustering quality and find the optimal clustering algorithm. Various clustering validity indices have been proposed for this purpose. Traditional clustering validity indices fall into two categories: internal and external. The former are mostly based on the compactness and separation of data points, measured by the distance between cluster centroids, and ignore the shape and density of clusters. The latter need external information, which is unavailable in most cases. In this paper, we propose a new clustering validity index for both fuzzy and hard clustering algorithms. The new index uses pairwise pattern information from a certain number of interrelated clustering results and focuses more on logical reasoning than on geometrical features. The proposed index overcomes some shortcomings of traditional indices. Experiments show that it performs better than traditional indices on artificial and real datasets. Furthermore, we applied the proposed method to two existing problems in the telecommunications field: clustering serving GPRS support nodes in the city of Chongqing based on service characteristics, and analyzing users' preferences.
Pages: 24977-24987
Page count: 11