Clustering categorical data using silhouette coefficient as a relocating measure

Cited by: 0
Authors
Aranganayagi, S. [1 ]
Thangavel, K. [2 ]
Affiliations
[1] JKK Nataraja Coll Arts & Sci, Komarapalayam 638183, India
[2] Periyar Univ, Dept Comp Sci, Madras 636011, Tamil Nadu, India
Source
ICCIMA 2007: INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND MULTIMEDIA APPLICATIONS, VOL II, PROCEEDINGS | 2007
Keywords
data mining; clustering; categorical data; silhouette coefficient
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Cluster analysis is an unsupervised learning method that constitutes a cornerstone of an intelligent data analysis process. Clustering categorical data is an important research area in data mining. In this paper we propose a novel algorithm to cluster categorical data. Objects are grouped into clusters based on the minimum dissimilarity value. In the merging process, the objects are relocated using the silhouette coefficient. Experimental results show that the proposed method is efficient.
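The abstract does not give the details of the relocation step, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a simple matching dissimilarity (the number of attributes on which two objects differ) and the standard silhouette coefficient s = (b - a) / max(a, b), where a is the mean dissimilarity to the object's own cluster and b is the smallest mean dissimilarity to any other cluster. The function names `mismatch`, `mean_dissimilarities`, `silhouette` and `relocate_once`, and the toy data, are hypothetical.

```python
from typing import Dict, List, Sequence

Row = Sequence[str]


def mismatch(a: Row, b: Row) -> int:
    """Simple matching dissimilarity: count of attributes that differ (assumed measure)."""
    return sum(x != y for x, y in zip(a, b))


def mean_dissimilarities(i: int, data: List[Row], labels: List[int]) -> Dict[int, float]:
    """Mean dissimilarity from object i to every cluster, excluding object i itself."""
    sums: Dict[int, int] = {}
    counts: Dict[int, int] = {}
    for j, (row, lab) in enumerate(zip(data, labels)):
        if j == i:
            continue
        sums[lab] = sums.get(lab, 0) + mismatch(data[i], row)
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: sums[lab] / counts[lab] for lab in counts}


def silhouette(i: int, data: List[Row], labels: List[int]) -> float:
    """Standard silhouette coefficient of object i; 0.0 for degenerate cases."""
    means = mean_dissimilarities(i, data, labels)
    own = labels[i]
    others = [m for lab, m in means.items() if lab != own]
    if own not in means or not others:
        return 0.0  # singleton cluster, or only one cluster present
    a, b = means[own], min(others)
    denom = max(a, b)
    return 0.0 if denom == 0 else (b - a) / denom


def relocate_once(data: List[Row], labels: List[int]) -> List[int]:
    """One pass: move each object with a negative silhouette to its closest cluster.

    All decisions use the original labels, so the result does not depend on object order.
    """
    new_labels = list(labels)
    for i in range(len(data)):
        if silhouette(i, data, labels) < 0:
            means = mean_dissimilarities(i, data, labels)
            new_labels[i] = min(means, key=means.get)
    return new_labels


# Toy data: the last object resembles cluster 0 but starts in cluster 1.
data = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("red", "large", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "oval"),
    ("red", "small", "round"),
]
labels = [0, 0, 0, 1, 1, 1]
print(relocate_once(data, labels))  # [0, 0, 0, 1, 1, 0]
```

A negative silhouette value means an object is, on average, closer to another cluster than to its own, which is why it is a natural trigger for relocation. A full method would presumably repeat such passes until no object moves.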
Pages: 13 / +
Page count: 3