Categorical fuzzy k-modes clustering with automated feature weight learning

Cited by: 22
Authors
Saha, Arkajyoti [1]
Das, Swagatam [2]
Affiliations
[1] Indian Stat Inst, Stat Math Unit, Kolkata 700108, India
[2] Indian Stat Inst, Elect & Commun Sci Unit, Kolkata 700108, India
Keywords
Fuzzy clustering; WFk-modes; Fuzzy K-modes; Automated feature weights; Categorical data; ALGORITHM; INFORMATION;
DOI
10.1016/j.neucom.2015.03.037
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This article presents and investigates a new variant of the fuzzy k-Modes clustering algorithm for categorical data with automated feature weight learning. The modification strengthens the classical fuzzy k-Modes algorithm by assigning higher weights to features that are instrumental in recognizing the clustering pattern of the data. A statistical comparison between the performance of the proposed algorithm and that of the conventional fuzzy k-Modes algorithm on synthetic and real-world datasets has been carried out with respect to mean values, best performance count, and medians. We take a novel approach to comparing the fuzziness of the obtained clusters. To the best of our knowledge, such a comparison is reported here for the first time for categorical data. The results show that the proposed algorithm enjoys an edge over the conventional fuzzy k-Modes algorithm in terms of both Rand Index and fuzziness measures. (C) 2015 Elsevier B.V. All rights reserved.
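As a rough illustration of the idea summarized in the abstract, the following minimal sketch combines a feature-weighted simple-matching dissimilarity with the standard fuzzy membership update used in fuzzy k-Modes. It is not the authors' implementation: the function names, the toy data, the fixed weights, and the choice to split membership evenly when an object coincides with several modes are all assumptions made here, and the paper's rule for learning the weights is not reproduced.

```python
def weighted_mismatch(x, mode, weights):
    """Sum the weights of the features on which object x and the mode differ."""
    return sum(w for xf, zf, w in zip(x, mode, weights) if xf != zf)


def fuzzy_memberships(dists, alpha=2.0):
    """Standard fuzzy membership update for one object.

    dists holds the object's dissimilarity to each cluster mode and
    alpha > 1 is the fuzziness exponent.
    """
    zero = [d == 0 for d in dists]
    if any(zero):
        # The object coincides with at least one mode: share full
        # membership among those modes (an assumption of this sketch).
        share = 1.0 / sum(zero)
        return [share if z else 0.0 for z in zero]
    inv = [d ** (-1.0 / (alpha - 1.0)) for d in dists]
    total = sum(inv)
    return [v / total for v in inv]


# Hypothetical toy data: two categorical features, two cluster modes.
modes = [("a", "x"), ("b", "y")]
weights = [0.75, 0.25]  # assumed fixed here; the paper learns them
obj = ("a", "y")

dists = [weighted_mismatch(obj, m, weights) for m in modes]
memberships = fuzzy_memberships(dists)
```

Because the first feature carries the larger weight, the object `("a", "y")` ends up closer to the mode it matches on that feature, which is the intuition behind giving informative features higher weights.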
Pages: 422-435
Number of pages: 14
Related papers
24 in total
[1]   The k-modes type clustering plus between-cluster information for categorical data [J].
Bai, Liang ;
Liang, Jiye .
NEUROCOMPUTING, 2014, 133 :111-121
[2]   A novel fuzzy clustering algorithm with between-cluster information for categorical data [J].
Bai, Liang ;
Liang, Jiye ;
Dang, Chuangyin ;
Cao, Fuyuan .
FUZZY SETS AND SYSTEMS, 2013, 215 :55-73
[3]  
Bezdek J. C., 1973, Journal of Cybernetics, V3, P58, DOI 10.1080/01969727308546047
[4]  
Bezdek J. C, 1975, PROC 8 INT C NUMER, P143
[5]   STATISTICAL PARAMETERS OF CLUSTER VALIDITY FUNCTIONALS [J].
BEZDEK, JC ;
WINDHAM, MP ;
EHRLICH, R .
INTERNATIONAL JOURNAL OF COMPUTER & INFORMATION SCIENCES, 1980, 9 (04) :323-336
[6]   A weighting k-modes algorithm for subspace clustering of categorical data [J].
Cao, Fuyuan ;
Liang, Jiye ;
Li, Deyu ;
Zhao, Xingwang .
NEUROCOMPUTING, 2013, 108 :23-30
[7]   A dissimilarity measure for the k-Modes clustering algorithm [J].
Cao, Fuyuan ;
Liang, Jiye ;
Li, Deyu ;
Bai, Liang ;
Dang, Chuangyin .
KNOWLEDGE-BASED SYSTEMS, 2012, 26 :120-127
[8]   SYNTHESIZED CLUSTERING - A METHOD FOR AMALGAMATING ALTERNATIVE CLUSTERING BASES WITH DIFFERENTIAL WEIGHTING OF VARIABLES [J].
DESARBO, WS ;
CARROLL, JD ;
CLARK, LA ;
GREEN, PE .
PSYCHOMETRIKA, 1984, 49 (01) :57-78
[9]  
Dunn JC., 1977, FUZZY AUTOMATA DECIS
[10]   A genetic fuzzy k-Modes algorithm for clustering categorical data [J].
Gan, G. ;
Wu, J. ;
Yang, Z. .
EXPERT SYSTEMS WITH APPLICATIONS, 2009, 36 (02) :1615-1620