A novel clustering approach using hierarchical genetic algorithms

Cited by: 29
Author
Lai, CC [1]
Affiliation
[1] Natl Univ Tainan, Dept Comp Sci & Informat Engn, Tainan 700, Taiwan
Keywords
clustering; hierarchical genetic algorithm; Davies-Bouldin index
DOI
10.1080/10798587.2005.10642900
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Clustering is a central task in data analysis that partitions heterogeneous data sets into groups with more homogeneous characteristics. However, most clustering algorithms require the user to provide the number of clusters as input. In this paper, we consider the automatic clustering problem, in which data points must be partitioned without any a priori knowledge of the correct number of clusters. A hierarchical genetic algorithm (HGA) is employed to search automatically for the number of clusters and to locate the cluster centers. The well-known Davies-Bouldin index is adopted as the measure of cluster validity. Experimental results on artificial and real-life data sets are given to illustrate the effectiveness of the proposed approach.
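
As an illustration of the validity measure named in the abstract, the following is a minimal sketch of the Davies-Bouldin index, together with one possible way a candidate solution could be scored when the number of clusters is itself being searched. The abstract does not describe the paper's chromosome encoding, so the `control_bits` / `candidate_centers` layout, the function names, and the handling of degenerate solutions are all assumptions made for this example, not the author's implementation.

```python
import math

def davies_bouldin(points, labels, centers):
    """Davies-Bouldin index of a hard partition (lower means better clusters)."""
    k = len(centers)
    if k < 2:
        raise ValueError("the index needs at least two clusters")
    # S_i: mean Euclidean distance from the members of cluster i to its center.
    scatter = []
    for i in range(k):
        members = [p for p, lab in zip(points, labels) if lab == i]
        if not members:
            raise ValueError(f"cluster {i} is empty")
        scatter.append(sum(math.dist(p, centers[i]) for p in members) / len(members))
    # R_i = max over j != i of (S_i + S_j) / d(c_i, c_j); the index is the mean R_i.
    total = 0.0
    for i in range(k):
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k) if j != i
        )
    return total / k

def decode_and_score(control_bits, candidate_centers, points):
    """Hypothetical fitness: control bits switch candidate centers on or off.

    Assumed encoding (not taken from the paper): bit b_i == 1 means
    candidate_centers[i] is an active cluster center.
    """
    active = [c for bit, c in zip(control_bits, candidate_centers) if bit]
    if len(active) < 2:
        return math.inf  # assumption: fewer than two clusters is invalid
    # Assign every point to its nearest active center.
    labels = [
        min(range(len(active)), key=lambda i: math.dist(p, active[i]))
        for p in points
    ]
    if len(set(labels)) < len(active):
        return math.inf  # assumption: penalise solutions with empty clusters
    return davies_bouldin(points, labels, active)

points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
candidates = [(0.0, 1.0), (5.0, 5.0), (10.0, 1.0)]
score = decode_and_score([1, 0, 1], candidates, points)
```

In a search of this kind, a genetic algorithm would minimise the returned score, so the number of active bits (and hence the number of clusters) is selected by the same fitness that judges the placement of the centers.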
Pages: 143-153
Number of pages: 11