A k-populations algorithm for clustering categorical data

Cited by: 22
Authors
Kim, DW [1]
Lee, K
Lee, D
Lee, KH
Affiliations
[1] Korea Adv Inst Sci & Technol, Dept BioSyst, Taejon 305701, South Korea
[2] Korea Adv Inst Sci & Technol, Adv Informat Technol Res Ctr, Taejon 305701, South Korea
[3] Korea Adv Inst Sci & Technol, Dept Elect Engn & Comp Sci, Taejon 305701, South Korea
Keywords
clustering; categorical data; hierarchical algorithm; k-modes algorithm; fuzzy k-modes algorithm
DOI
10.1016/j.patcog.2004.11.017
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, the conventional k-modes-type algorithms for clustering categorical data are extended by representing the clusters of categorical data with k-populations instead of the hard-type centroids used in the conventional algorithms. Use of a population-based centroid representation makes it possible to preserve the uncertainty inherent in data sets as long as possible before actual decisions are made. The k-populations algorithm was found to give markedly better clustering results through various experiments. (c) 2005 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
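To make the contrast in the abstract concrete, here is a minimal sketch that compares a hard, mode-based centroid with a population-style centroid. This record does not give the paper's definitions, so everything here is an assumption for illustration: the population is taken to be the per-attribute relative frequency of each category, and the function names `mode_centroid`, `population_centroid`, and `dissimilarity` are hypothetical, not the authors' notation.

```python
from collections import Counter


def mode_centroid(cluster):
    """Hard centroid in the k-modes style: most frequent category per attribute."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*cluster)]


def population_centroid(cluster):
    """Assumed population-style centroid: category shares per attribute."""
    n = len(cluster)
    return [{v: c / n for v, c in Counter(col).items()} for col in zip(*cluster)]


def dissimilarity(record, centroid):
    """Assumed measure: sum over attributes of 1 minus the share of the record's category."""
    return sum(1.0 - dist.get(v, 0.0) for v, dist in zip(record, centroid))


# Toy cluster of four records with two categorical attributes (made-up data).
cluster = [
    ("red", "small"),
    ("red", "large"),
    ("blue", "small"),
    ("red", "small"),
]

hard = mode_centroid(cluster)        # keeps only the majority category per attribute
soft = population_centroid(cluster)  # keeps the minority categories as well

print(hard)
print(soft)
print(dissimilarity(("blue", "large"), soft))
print(dissimilarity(("green", "tiny"), soft))
```

Under simple matching against the hard centroid, the records ("blue", "large") and ("green", "tiny") are equally far from the cluster, because neither matches the mode on any attribute. The population-style centroid still remembers that "blue" and "large" occur in the cluster, so it ranks the first record as closer. This is one way to read the abstract's point about preserving uncertainty before a hard decision is made.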
Pages: 1131-1134
Number of pages: 4
Related papers
4 items in total
[1]   Blake C. L., 1989, UCI REPOSITORY MACHI
[2]   SYMBOLIC CLUSTERING USING A NEW DISSIMILARITY MEASURE [J].
GOWDA, KC ;
DIDAY, E .
PATTERN RECOGNITION, 1991, 24 (06) :567-578
[3]   Extensions to the k-means algorithm for clustering large data sets with categorical values [J].
Huang, ZX .
DATA MINING AND KNOWLEDGE DISCOVERY, 1998, 2 (03) :283-304
[4]   A fuzzy k-modes algorithm for clustering categorical data [J].
Huang, ZX ;
Ng, MK .
IEEE TRANSACTIONS ON FUZZY SYSTEMS, 1999, 7 (04) :446-452