Automatic Fuzzy Clustering Using Non-Dominated Sorting Particle Swarm Optimization Algorithm for Categorical Data

Cited by: 13
Authors
Thi Phuong Quyen Nguyen [1]
Kuo, R. J. [1]
Affiliations
[1] Natl Taiwan Univ Sci & Technol, Dept Ind Management, Taipei 106, Taiwan
Keywords
Automatic clustering; categorical data; local density; NSPSO; GENETIC ALGORITHM; K-MEANS; C-MEANS
DOI
10.1109/ACCESS.2019.2927593
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Categorical data clustering has attracted a lot of attention recently because of its importance in real-world applications. Many clustering methods have been proposed for categorical data. However, most existing algorithms require a predefined number of clusters, which is usually unavailable in real-world problems. Only a few works have focused on automatic clustering, and these mainly handle numerical data. This study develops a novel automatic fuzzy clustering using non-dominated sorting particle swarm optimization (AFC-NSPSO) algorithm for categorical data. The proposed AFC-NSPSO algorithm can automatically identify the optimal number of clusters and obtain the clustering result for the selected number of clusters. In addition, a new technique is proposed to identify the maximum number of clusters in a dataset based on local density. To select a final solution from the first Pareto front, internal validation indices are used. Experiments on real-world datasets collected from the UCI Machine Learning Repository show that the proposed AFC-NSPSO is effective compared with several existing automatic categorical clustering algorithms. This study also applies the proposed algorithm to a real-world case study with an unknown number of clusters.
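The abstract refers to selecting a final solution from the first Pareto front. As a rough illustration of that idea only, the sketch below extracts the non-dominated candidates from a small set of objective vectors. It assumes two objectives that are both minimized; the objective values, the function names `dominates` and `first_pareto_front`, and the minimization convention are hypothetical and are not taken from the paper, whose actual objectives are not stated in this record.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b, assuming every objective is minimized."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def first_pareto_front(points: List[Sequence[float]]) -> List[int]:
    """Return the indices of points that no other point dominates."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical objective values (both minimized) for five candidate solutions.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0), (2.5, 3.5)]
front = first_pareto_front(candidates)
print(front)  # indices of the non-dominated candidates
```

A full non-dominated sorting step would repeat this on the remaining points to build the second and later fronts; this sketch stops at the first front, which is the set the abstract says the final solution is chosen from.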
Pages: 99721-99734
Number of pages: 14