A multi-objective genetic algorithm with fuzzy c-means for automatic data clustering

Cited by: 72
Author
Wikaisuksakul, Siripen [1]
Affiliation
[1] Prince Songkla Univ, Fac Sci & Technol, Dept Math & Comp Sci, Muang 94000, Pattani, Thailand
Keywords
Clustering; Multiobjective optimization; Fuzzy clustering; Genetic algorithms; INDEX;
DOI
10.1016/j.asoc.2014.08.036
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This article presents a multi-objective genetic algorithm that addresses the problem of data clustering. A given dataset is automatically assigned to a number of groups in appropriate fuzzy partitions through the fuzzy c-means method. This work exploits the advantage of fuzzy properties, which provide the capability to handle overlapping clusters. However, most fuzzy methods are based on compactness and/or separation measures that use only centroid information. Calculations based on centroid information alone may not be sufficient to differentiate the geometric structures of clusters. The overlap-separation measure, which uses an aggregation operation on fuzzy membership degrees, is better equipped to handle this drawback. As another key consideration, a mechanism is needed to identify appropriate fuzzy clusters without prior knowledge of the number of clusters. Given this requirement, optimization with a single criterion may not be feasible for different cluster shapes. A multi-objective genetic algorithm is therefore appropriate to search for fuzzy partitions in this situation. Apart from the overlap-separation measure, the well-known fuzzy J_m index is also optimized through genetic operations. The algorithm simultaneously optimizes the two criteria to search for optimal clustering solutions. A string of real-coded values is encoded to represent cluster centers. Strings of different lengths, varied over a range, correspond to variable numbers of clusters. These real-coded values are optimized, and the Pareto solutions corresponding to a tradeoff between the two objectives are finally produced. As shown in the experiments, the approach provides promising solutions for well-separated, hyperspherical and overlapping clusters from synthetic and real-life data sets. This is demonstrated by comparison with existing single-objective and multi-objective clustering techniques. (C) 2014 Elsevier B.V. All rights reserved.
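The encoding and the J_m objective described in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the function names (`decode`, `memberships`, `j_m`, `pareto_front`), the fuzzifier value m = 2, and the assumption that both objectives are minimized are all illustrative choices. The overlap-separation measure is not reproduced here because the abstract does not give its formula; the Pareto filter simply accepts any pair of objective values.

```python
import math

def decode(chromosome, dim):
    """Split a flat real-coded string into cluster centers of dimension `dim`.
    A longer string therefore encodes more clusters (hypothetical layout)."""
    return [chromosome[i:i + dim] for i in range(0, len(chromosome), dim)]

def memberships(points, centers, m=2.0):
    """Standard fuzzy c-means membership degrees u[i][k] (center i, point k)."""
    u = [[0.0] * len(points) for _ in centers]
    for k, x in enumerate(points):
        d = [math.dist(x, c) for c in centers]
        zeros = [i for i, dist in enumerate(d) if dist == 0.0]
        if zeros:
            # The point coincides with one or more centers: share membership.
            for i in zeros:
                u[i][k] = 1.0 / len(zeros)
            continue
        for i, di in enumerate(d):
            u[i][k] = 1.0 / sum((di / dj) ** (2.0 / (m - 1.0)) for dj in d)
    return u

def j_m(points, centers, m=2.0):
    """Fuzzy J_m index: membership-weighted sum of squared distances."""
    u = memberships(points, centers, m)
    return sum(u[i][k] ** m * math.dist(x, c) ** 2
               for i, c in enumerate(centers)
               for k, x in enumerate(points))

def pareto_front(scores):
    """Indices of non-dominated objective tuples, assuming minimization."""
    def dominates(a, b):
        return (all(p <= q for p, q in zip(a, b))
                and any(p < q for p, q in zip(a, b)))
    return [i for i, s in enumerate(scores)
            if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)]
```

In a genetic search of this kind, each individual would be decoded with `decode`, scored on both objectives, and the non-dominated individuals kept via something like `pareto_front`; the selection, crossover and mutation operators themselves are outside the scope of this sketch.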
Pages: 679-691
Page count: 13
Related papers
34 in total
[1]  
[Anonymous], 2001, Pattern Classification
[2]   Genetic clustering for automatic evolution of clusters and application to image classification [J].
Bandyopadhyay, S ;
Maulik, U .
PATTERN RECOGNITION, 2002, 35 (06) :1197-1208
[3]   A point symmetry-based clustering technique for automatic evolution of clusters [J].
Bandyopadhyay, Sanghamitra ;
Saha, Sriparna .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2008, 20 (11) :1441-1457
[4]   GAPS: A clustering method using a new point symmetry-based distance measure [J].
Bandyopadhyay, Sanghamitra ;
Saha, Sriparna .
PATTERN RECOGNITION, 2007, 40 (12) :3430-3451
[5]  
Bandyopadhyay Sanghamitra, 2007, P1, DOI 10.1007/3-540-49607-6_1
[6]   FCM - THE FUZZY C-MEANS CLUSTERING-ALGORITHM [J].
BEZDEK, JC ;
EHRLICH, R ;
FULL, W .
COMPUTERS & GEOSCIENCES, 1984, 10 (2-3) :191-203
[7]  
Capitaine H.L., 2008, LECT NOTES COMP SCI, V5342, P622, DOI 10.1007/978-3-540-89689-0_66
[8]  
CAPITAINE HL, 2011, IEEE T FUZZY SYST, V19, P580
[9]   A genetic algorithm with gene rearrangement for K-means clustering [J].
Chang, Dong-Xia ;
Zhang, Xian-Da ;
Zheng, Chang-Wen .
PATTERN RECOGNITION, 2009, 42 (07) :1210-1222
[10]  
Deb, 1994, EVOLUTIONARY COMPUTA, V2, P221, DOI 10.1162/EVCO.1994.2.3.221