An accelerated K-means clustering algorithm using selection and erasure rules

Cited by: 7
Authors
Lee, Suiang-Shyan [1]
Lin, Ja-Chen [1]
Affiliations
[1] Natl Chiao Tung Univ, Dept Comp Sci, Hsinchu 30050, Taiwan
Source
JOURNAL OF ZHEJIANG UNIVERSITY-SCIENCE C-COMPUTERS & ELECTRONICS | 2012, Vol. 13, No. 10
Keywords
K-means clustering; Acceleration; Vector quantization; Selection; Erasure
DOI
10.1631/jzus.C1200078
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The K-means method is a well-known clustering algorithm with an extensive range of applications, such as biological classification, disease analysis, data mining, and image compression. However, the plain K-means method is not fast when the number of clusters or the number of data points becomes large. A modified K-means algorithm was presented by Fahim et al. (2006). The modified algorithm produced clusters whose mean square error was very similar to that of the plain K-means, but the execution time was shorter. In this study, we try to further increase its speed. There are two rules in our method: a selection rule, used to acquire a good candidate as the initial center to be checked, and an erasure rule, used to delete one or many unqualified centers each time a specified condition is satisfied. Our clustering results are identical to those of Fahim et al. (2006). However, our method further cuts computation time when the number of clusters increases. The mathematical reasoning used in our design is included.
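The abstract does not state the selection and erasure rules themselves, so they are not reproduced here. As a rough illustration of the general idea only (start from a good candidate center, then discard centers that cannot win, without changing the result), the following minimal Python sketch uses the standard triangle-inequality bound instead of the paper's rules. The function names, the `start` parameter, and the random test data are all hypothetical.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign_brute_force(points, centers):
    """Plain K-means assignment step: check every center for every point."""
    return [min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points]


def assign_with_pruning(points, centers, start=None):
    """Same assignment, skipping centers ruled out by the triangle inequality.

    start[i] is the first center checked for point i (for example, the
    center it belonged to in the previous iteration); index 0 is used
    when start is None. This is a generic bound, NOT the paper's rules:
    if d(c_best, c_j) >= 2 * d(p, c_best), then c_j cannot be closer to
    p than c_best, so its distance is never computed.
    """
    k = len(centers)
    # Center-to-center distances, computed once per assignment pass.
    cc = [[dist(centers[a], centers[b]) for b in range(k)] for a in range(k)]
    labels, skipped = [], 0
    for i, p in enumerate(points):
        best = start[i] if start is not None else 0
        best_d = dist(p, centers[best])
        for j in range(k):
            if j == best:
                continue
            if cc[best][j] >= 2.0 * best_d:
                skipped += 1  # center j is discarded without a distance check
                continue
            d = dist(p, centers[j])
            if d < best_d:
                best, best_d = j, d
        labels.append(best)
    return labels, skipped


# Hypothetical data: 8 centers and 200 points in the plane.
random.seed(0)
centers = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(8)]
points = [(random.uniform(0, 100), random.uniform(0, 100)) for _ in range(200)]
labels, skipped = assign_with_pruning(points, centers)
print("distance computations skipped:", skipped)
```

A better starting center makes the bound tighter, so more centers are discarded early; the assignment itself is the same as the brute-force one, which mirrors the abstract's claim that only the running time changes.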
Pages: 761-768
Number of pages: 8
Related papers
16 in total
  • [1] [Anonymous], 2009, INT J CIRCUITS SYSTE
  • [2] A methodology for dynamic data mining based on fuzzy clustering
    Crespo, F
    Weber, R
    [J]. FUZZY SETS AND SYSTEMS, 2005, 150 (02) : 267 - 284
  • [3] Fahim AM., 2006, J ZHEJIANG UNIV-SC A, V7, P1626, DOI 10.1631/jzus.2006.A1626
  • [4] Frank A., 2010, UCI machine learning repository, V213
  • [5] Multi-face detection based on downsampling and modified subtractive clustering for color images
    Kong Wan-zeng
    Zhu Shan-an
    [J]. JOURNAL OF ZHEJIANG UNIVERSITY-SCIENCE A, 2007, 8 (01) : 72 - 78
  • [6] Vector quantization of images using a fuzzy clustering method
    Lee, Wan-Jui
    Chung, Jun-Shih
    Ouyang, Chen-Sen
    Lee, Shie-Jue
    [J]. CYBERNETICS AND SYSTEMS, 2008, 39 (01) : 45 - 60
  • [7] Mining Outliers in Correlated Subspaces for High Dimensional Data Sets
    Leng, Jinsong
    Hong, Tzung-Pei
    [J]. FUNDAMENTA INFORMATICAE, 2010, 98 (01) : 71 - 86
  • [8] Lin HJ, 2005, J APPL SCI ENG, V8, P113
  • [9] Multi-class clustering by analytical two-class formulas
    Lin, JC
    [J]. INTERNATIONAL JOURNAL OF PATTERN RECOGNITION AND ARTIFICIAL INTELLIGENCE, 1996, 10 (04) : 307 - 323
  • [10] Hierarchical initialization approach for K-Means clustering
    Lu, J. F.
    Tang, J. B.
    Tang, Z. M.
    Yang, J. Y.
    [J]. PATTERN RECOGNITION LETTERS, 2008, 29 (06) : 787 - 795