Improved Fuzzy Clustering Techniques for Categorical Data

Cited by: 0
Authors
Saha, Indrajit [1 ]
Maulik, Ujjwal [2 ]
Affiliations
[1] Acad Technol, Dept Informat Technol, Adisaptagram 712121, W Bengal, India
[2] Univ Jadavpur, Dept Comp Sci & Engn, Kolkata, W Bengal, India
Source
IAENG TRANSACTIONS ON ENGINEERING TECHNOLOGIES VOL 1 | 2009 / Vol. 1089
Keywords
Fuzzy Clustering; Differential Evolution; Genetic Algorithm; Simulated Annealing; Cluster Validity Indices; Statistical significance test; OPTIMIZATION; ALGORITHM;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Clustering is a widely used technique in data mining applications for discovering patterns in underlying data. Most traditional clustering algorithms are limited in handling datasets that contain categorical attributes. However, datasets with categorical attributes are common in real-life data mining problems. For these datasets, no inherent distance measure, such as the Euclidean distance, is available to compute the distance between two categorical objects. In this article, we describe fuzzy clustering techniques based on differential evolution, genetic algorithms, and simulated annealing. The performance of the proposed algorithms is compared with that of several well-known categorical data clustering algorithms and demonstrated on a variety of artificial and real-life categorical datasets. A statistical significance test has been performed to establish the superiority of the proposed algorithms.
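The abstract does not say which dissimilarity measure the proposed algorithms use in place of the Euclidean distance. As a minimal sketch only, the Python below assumes the simple matching dissimilarity and the membership formula of the fuzzy k-modes algorithm (Huang and Ng, entry [8] in the list below). The function names, the toy data, and the rule of splitting membership equally among coinciding modes are illustrative assumptions, not the authors' method.

```python
def matching_dissimilarity(x, y):
    """Number of attributes on which two categorical objects differ."""
    return sum(a != b for a, b in zip(x, y))


def fuzzy_memberships(data, modes, m=2.0):
    """Membership of every object in every cluster for fixed cluster modes.

    Returns one row per object and one column per mode; each row sums to 1.
    m > 1 is the fuzzy exponent (assumed value 2.0).
    """
    memberships = []
    for x in data:
        d = [matching_dissimilarity(z, x) for z in modes]
        if 0 in d:
            # The object coincides with at least one mode. Full membership
            # is split equally among those modes (an illustrative choice).
            hits = d.count(0)
            row = [1.0 / hits if di == 0 else 0.0 for di in d]
        else:
            # Fuzzy k-modes style update: membership is proportional to
            # d ** (-1 / (m - 1)), normalised over all clusters.
            inv = [di ** (-1.0 / (m - 1.0)) for di in d]
            total = sum(inv)
            row = [v / total for v in inv]
        memberships.append(row)
    return memberships


# Toy categorical objects with two attributes each (hypothetical data).
data = [("a", "x"), ("a", "y"), ("b", "y")]
modes = [("a", "x"), ("b", "y")]
print(fuzzy_memberships(data, modes))
# prints [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
```

The middle object differs from each mode on exactly one attribute, so it belongs equally to both clusters. A metaheuristic such as differential evolution, a genetic algorithm, or simulated annealing would then search over candidate mode sets scored with memberships of this kind.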
Pages: 82 / +
Page count: 2
Related papers
20 in total
[1]  
[Anonymous], P IEEE INT FUZZ SYST
[2]  
[Anonymous], 1995, Tech. Rep. TR-95-012
[3]  
Ben-Hur A., 2003, DETECTING STABLE CLU
[4]  
Bezdek J.C., 1981, PATTERN RECOGNITION
[5]  
DEVIJVER PA, 1982, PATTERN RECOGNITION
[6]  
Goldberg D. E., 1989, Genetic algorithms in machine learning, search and optimization
[7]  
Hartigan J.A., 1975, CLUSTERING ALGORITHM
[8]   A fuzzy k-modes algorithm for clustering categorical data [J].
Huang, ZX ;
Ng, MK .
IEEE TRANSACTIONS ON FUZZY SYSTEMS, 1999, 7 (04) :446-452
[9]   OPTIMIZATION BY SIMULATED ANNEALING [J].
KIRKPATRICK, S ;
GELATT, CD ;
VECCHI, MP .
SCIENCE, 1983, 220 (4598) :671-680
[10]   Performance evaluation of some clustering algorithms and validity indices [J].
Maulik, U ;
Bandyopadhyay, S .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2002, 24 (12) :1650-1654