Coercion: A Distributed Clustering Algorithm for Categorical Data

Citations: 0
Authors
Wang, Bin [1]
Zhou, Yang [1]
Hei, Xinhong [1]
Affiliations
[1] Xian Univ Technol, Sch Comp Sci & Engn, Xian, Peoples R China
Source
2013 9TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL INTELLIGENCE AND SECURITY (CIS) | 2013
Keywords
categorical data; Squeezer; clustering; distributed; data mining
DOI
10.1109/CIS.2013.149
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Clustering is an important technique in data mining. Squeezer is a clustering algorithm for categorical data that is more efficient than most existing algorithms for such data. However, Squeezer is time-consuming on very large datasets that are distributed across different servers. We therefore apply a distributed approach to improve Squeezer and propose Coercion, a distributed clustering algorithm for categorical data. In addition to presenting detailed complexity results for Coercion, we conduct an experimental study on standard and synthetic data sets to demonstrate the effectiveness of the new algorithm.
Pages: 683-687
Page count: 5
Related papers
13 in total
[1]  
Ankerst M, 1999, SIGMOD RECORD, VOL 28, NO 2 - JUNE 1999, P49
[2]   A dissimilarity measure for the k-Modes clustering algorithm [J].
Cao, Fuyuan ;
Liang, Jiye ;
Li, Deyu ;
Bai, Liang ;
Dang, Chuangyin .
KNOWLEDGE-BASED SYSTEMS, 2012, 26 :120-127
[3]   MATRIX MULTIPLICATION VIA ARITHMETIC PROGRESSIONS [J].
COPPERSMITH, D ;
WINOGRAD, S .
JOURNAL OF SYMBOLIC COMPUTATION, 1990, 9 (03) :251-280
[4]  
Duda R., 1973, PATTERN CLASSIFICATI, P103
[5]  
Ester M., 1996, DENSITY BASED ALGORI, DOI 10.5555/3001460.3001507
[6]  
Guha S., 1998, SIGMOD Record, V27, P73, DOI 10.1145/276305.276312
[7]   ROCK: A robust clustering algorithm for categorical attributes [J].
Guha, S ;
Rastogi, R ;
Shim, K .
15TH INTERNATIONAL CONFERENCE ON DATA ENGINEERING, PROCEEDINGS, 1999, :512-521
[8]   Squeezer: An efficient algorithm for clustering categorical data [J].
He, ZY ;
Xu, XF ;
Deng, SC .
JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY, 2002, 17 (05) :611-624
[9]   A Link-Based Cluster Ensemble Approach for Categorical Data Clustering [J].
Iam-On, Natthakan ;
Boongoen, Tossapon ;
Garrett, Simon ;
Price, Chris .
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2012, 24 (03) :413-425
[10]  
Macqueen J., 1967, 5 BERK S MATH STAT P, P281, DOI 10.1007/S11665-016-2173-6