How many clusters? A robust PSO-based local density model

被引:34
作者
Ling, Hui-Liang [1 ]
Wu, Jian-Sheng [1 ,2 ]
Zhou, Yi [3 ]
Zheng, Wei-Shi [1 ]
机构
[1] Sun Yat Sen Univ, Sch Data & Comp Sci, Guangzhou 510006, Guangdong, Peoples R China
[2] Nanchang Univ, Informat Engn Sch, Nanchang 330038, Peoples R China
[3] Sun Yat Sen Univ, Zhongshan Sch Med, Dept Biomed Engn, Guangzhou 510080, Guangdong, Peoples R China
基金
中国国家自然科学基金;
关键词
Clustering; Estimation of number of clusters; Local density; PSO; PARTICLE SWARM OPTIMIZATION; ALGORITHM; SEARCH;
D O I
10.1016/j.neucom.2016.03.071
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
While most clustering methods assume that the number of data clusters is known, automatically estimating the number of clusters by algorithm itself is still a challenging problem in the data clustering field. In this paper, we aim to develop a novel local and not differentiable clustering method based on Particle Swarm Optimization, which can estimate the number of clusters automatically. In particular, the proposed approach measures the local compactness of each cluster by local density function, pushes the PSO towards maximizing such a compactness, and penalizes the whole procedure to avoid estimating quite a lot of clusters during the evolution. The compactness modeling makes the clustering robust to outliers and noise. In addition, due to the merit of PSO, although kernel trick is used in our modeling, it does not consume too much memory when more and more data are processed. The evaluation on the synthetic dataset and the five publicly, available datasets shows that our algorithm can estimate the appropriate number of clusters and outperforms six related state-of-the-art clustering methods that can also estimate the number of clusters. (C) 2016 Elsevier B.V. All rights reserved.
引用
收藏
页码:264 / 275
页数:12
相关论文
共 49 条
[1]   Research on particle swarm optimization based clustering: A systematic review of literature and techniques [J].
Alam, Shafiq ;
Dobbie, Gillian ;
Koh, Yun Sing ;
Riddle, Patricia ;
Rehman, Saeed Ur .
SWARM AND EVOLUTIONARY COMPUTATION, 2014, 17 :1-13
[2]   A point symmetry-based clustering technique for automatic evolution of clusters [J].
Bandyopadhyay, Sanghamitra ;
Saha, Sriparna .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2008, 20 (11) :1441-1457
[3]   Support vector clustering [J].
Ben-Hur, A ;
Horn, D ;
Siegelmann, HT ;
Vapnik, V .
JOURNAL OF MACHINE LEARNING RESEARCH, 2002, 2 (02) :125-137
[4]  
Berkhin P, 2006, GROUPING MULTIDIMENSIONAL DATA: RECENT ADVANCES IN CLUSTERING, P25
[5]   Multiniche Crowding in Genetic Algorithms and Its Application to the Assembly of DNA Restriction-Fragments [J].
Cedeno, Walter ;
Vemuri, V. Rao ;
Slezak, Tom .
EVOLUTIONARY COMPUTATION, 1994, 2 (04) :321-345
[6]   MEAN SHIFT, MODE SEEKING, AND CLUSTERING [J].
CHENG, YZ .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1995, 17 (08) :790-799
[7]  
Cohen SCM, 2006, IEEE C EVOL COMPUTAT, P1777
[8]  
Das S., IEEE T SYST MAN CY A, V38
[9]   Automatic kernel clustering with a Multi-Elitist Particle Swarm Optimization Algorithm [J].
Das, Swagatam ;
Abraham, Ajith ;
Konar, Amit .
PATTERN RECOGNITION LETTERS, 2008, 29 (05) :688-699
[10]   Where are the niches? Dynamic fitness sharing [J].
Della Cioppa, Antonio ;
De Stefano, Claudio ;
Marcelli, Angelo .
IEEE TRANSACTIONS ON EVOLUTIONARY COMPUTATION, 2007, 11 (04) :453-465