PCA-guided search for K-means

被引:42
|
作者
Xu, Qin [1 ]
Ding, Chris [2 ]
Liu, Jinpei [3 ]
Luo, Bin [1 ]
机构
[1] Anhui Univ, Sch Comp Sci & Technol, Hefei 230601, Anhui, Peoples R China
[2] Univ Texas Arlington, Dept Comp Sci & Engn, Arlington, TX 76019 USA
[3] Anhui Univ, Sch Business, Hefei 730601, Anhui, Peoples R China
基金
中国国家自然科学基金;
关键词
K-means; Principal component analysis; Cluster centroid initialization; Clustering; ALGORITHM;
D O I
10.1016/j.patrec.2014.11.017
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
K-means is undoubtedly the most widely used partitional clustering algorithm. Unfortunately, due to the non-convexity of the model formulations, expectation-maximization (EM) type algorithms converge to different local optima with different initializations. Recent discoveries have identified that the global solution of K-means cluster centroids lies in the principal component analysis (PCA) subspace. Based on this insight, we propose PCA-guided effective search for K-means. Because the PCA subspace is much smaller than the original space, searching in the PCA subspace is both more effective and efficient. Extensive experiments on four real world data sets and systematic comparison with previous algorithms demonstrate that our proposed method outperforms the rest as it makes the K-means more effective. (C) 2015 Elsevier B.V. All rights reserved.
引用
收藏
页码:50 / 55
页数:6
相关论文
共 50 条
  • [1] PCA-guided k-Means Clustering With Incomplete Data
    Honda, Katsuhiro
    Nonoguchi, Ryoichi
    Notsu, Akira
    Ichihashi, Hidetomo
    IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ 2011), 2011, : 1710 - 1714
  • [2] Fuzzy PCA-Guided Robust k-Means Clustering
    Honda, Katsuhiro
    Notsu, Akira
    Ichihashi, Hidetomo
    IEEE TRANSACTIONS ON FUZZY SYSTEMS, 2010, 18 (01) : 67 - 79
  • [3] Variable Weighting in PCA-Guided k-Means and its Connection with Information Summarization
    Honda, Katsuhiro
    Notsu, Akira
    Ichihashi, Hidetomo
    JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS, 2011, 15 (01) : 83 - 89
  • [4] In Search of Star Clusters: An Introduction to the K-Means Algorithm
    Ferreira Nascimento, Marcio Luis
    JOURNAL OF HUMANISTIC MATHEMATICS, 2022, 12 (01): : 243 - 255
  • [5] Preconditioned Data Sparsification for Big Data With Applications to PCA and K-Means
    Pourkamali-Anaraki, Farhad
    Becker, Stephen
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2017, 63 (05) : 2954 - 2974
  • [6] A local search approximation algorithm for k-means clustering
    Kanungo, T
    Mount, DM
    Netanyahu, NS
    Piatko, CD
    Silverman, R
    Wu, AY
    COMPUTATIONAL GEOMETRY-THEORY AND APPLICATIONS, 2004, 28 (2-3): : 89 - 112
  • [7] A local search algorithm for k-means with outliers q
    Zhang, Zhen
    Feng, Qilong
    Huang, Junyu
    Guo, Yutian
    Xu, Jinhui
    Wang, Jianxin
    NEUROCOMPUTING, 2021, 450 : 230 - 241
  • [8] Network Traffic Prediction using PCA and K-means
    Holanda Filho, Raimir
    Bessa Maia, Jose Everardo
    PROCEEDINGS OF THE 2010 IEEE-IFIP NETWORK OPERATIONS AND MANAGEMENT SYMPOSIUM, 2010, : 938 - 941
  • [9] PCA-guided Fuzzy Cluster Validation With Noise Rejection
    Honda, Katsuhiro
    Notsu, Akira
    Matsui, Tomohiro
    Ichihashi, Hidetomo
    2010 IEEE INTERNATIONAL CONFERENCE ON FUZZY SYSTEMS (FUZZ-IEEE 2010), 2010,
  • [10] Horizontal SCA Attacks against kP Algorithm Using K-Means and PCA
    Aftowicz, Marcin
    Kabin, Ievgen
    Klann, Dan
    Varabei, Yauhen
    Dyka, Zoya
    Langendoerfer, Peter
    2020 9TH MEDITERRANEAN CONFERENCE ON EMBEDDED COMPUTING (MECO), 2020, : 109 - 115