A fuzzy k-modes algorithm for clustering categorical data

Cited by: 289
Authors
Huang, ZX [1]
Ng, MK [2]
Affiliations
[1] Management Informat Principles Ltd, Melbourne, Vic, Australia
[2] Univ Hong Kong, Dept Math, Hong Kong, Peoples R China
Keywords
categorical data; clustering; data mining; fuzzy partitioning; k-means algorithm
DOI
10.1109/91.784206
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This correspondence describes extensions to the fuzzy k-means algorithm for clustering categorical data. By using a simple matching dissimilarity measure for categorical objects and modes instead of means for clusters, a new approach is developed, which allows the use of the k-means paradigm to efficiently cluster large categorical data sets. A fuzzy k-modes algorithm is presented and the effectiveness of the algorithm is demonstrated with experimental results.
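The two ingredients named in the abstract, a simple matching dissimilarity and modes in place of means, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the fuzziness exponent `alpha`, and the tie-breaking rules are assumptions made for illustration, and the membership and mode updates follow the usual fuzzy-clustering pattern rather than anything stated in this record.

```python
from collections import defaultdict


def matching_dissimilarity(x, z):
    """Simple matching: number of attributes on which two records differ."""
    return sum(a != b for a, b in zip(x, z))


def update_memberships(data, modes, alpha=1.5):
    """Return w[l][i], the fuzzy membership of object i in cluster l.

    alpha > 1 is an assumed fuzziness exponent (hypothetical default).
    """
    k = len(modes)
    w = [[0.0] * len(data) for _ in range(k)]
    for i, x in enumerate(data):
        d = [matching_dissimilarity(x, z) for z in modes]
        if 0 in d:
            # Object equals a mode: give it full membership in the
            # first such cluster (assumed tie-breaking rule).
            w[d.index(0)][i] = 1.0
            continue
        inv = [(1.0 / dl) ** (1.0 / (alpha - 1.0)) for dl in d]
        total = sum(inv)
        for l in range(k):
            w[l][i] = inv[l] / total
    return w


def update_modes(data, w, alpha=1.5):
    """For each cluster and attribute, pick the category with the
    largest sum of memberships raised to the power alpha."""
    modes = []
    for wl in w:
        mode = []
        for j in range(len(data[0])):
            score = defaultdict(float)
            for x, wi in zip(data, wl):
                score[x[j]] += wi ** alpha
            mode.append(max(score, key=score.get))
        modes.append(tuple(mode))
    return modes


# Toy data: four records with two categorical attributes each.
data = [("a", "x"), ("a", "y"), ("b", "z"), ("b", "z")]
modes = [("a", "x"), ("b", "z")]
w = update_memberships(data, modes)
modes = update_modes(data, w)
```

Alternating the two update functions until the modes stop changing gives a k-means-style loop that never computes a numeric mean, which is why the approach suits categorical attributes.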
Pages: 446-452
Page count: 7