PCA-guided search for K-means

被引:42
作者
Xu, Qin [1 ]
Ding, Chris [2 ]
Liu, Jinpei [3 ]
Luo, Bin [1 ]
机构
[1] Anhui Univ, Sch Comp Sci & Technol, Hefei 230601, Anhui, Peoples R China
[2] Univ Texas Arlington, Dept Comp Sci & Engn, Arlington, TX 76019 USA
[3] Anhui Univ, Sch Business, Hefei 730601, Anhui, Peoples R China
基金
中国国家自然科学基金;
关键词
K-means; Principal component analysis; Cluster centroid initialization; Clustering; ALGORITHM;
D O I
10.1016/j.patrec.2014.11.017
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
K-means is undoubtedly the most widely used partitional clustering algorithm. Unfortunately, due to the non-convexity of the model formulations, expectation-maximization (EM) type algorithms converge to different local optima with different initializations. Recent discoveries have identified that the global solution of K-means cluster centroids lies in the principal component analysis (PCA) subspace. Based on this insight, we propose PCA-guided effective search for K-means. Because the PCA subspace is much smaller than the original space, searching in the PCA subspace is both more effective and efficient. Extensive experiments on four real world data sets and systematic comparison with previous algorithms demonstrate that our proposed method outperforms the rest as it makes the K-means more effective. (C) 2015 Elsevier B.V. All rights reserved.
引用
收藏
页码:50 / 55
页数:6
相关论文
共 50 条
  • [31] EFFECTIVE INITIALIZATION OF K-MEANS FOR COLOR QUANTIZATION
    Celebi, M. Emre
    2009 16TH IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOLS 1-6, 2009, : 1649 - 1652
  • [32] Optimization of K-means clustering method using hybrid capuchin search algorithm
    Amjad Qtaish
    Malik Braik
    Dheeb Albashish
    Mohammad T. Alshammari
    Abdulrahman Alreshidi
    Eissa Jaber Alreshidi
    The Journal of Supercomputing, 2024, 80 : 1728 - 1787
  • [33] Optimization of K-means clustering method using hybrid capuchin search algorithm
    Qtaish, Amjad
    Braik, Malik
    Albashish, Dheeb
    Alshammari, Mohammad T. T.
    Alreshidi, Abdulrahman
    Alreshidi, Eissa Jaber
    JOURNAL OF SUPERCOMPUTING, 2024, 80 (02) : 1728 - 1787
  • [34] Optimization of Clustering process for WSN with Hybrid Harmony search and K-means algorithm
    Raval, Dharmanshu
    Raval, Gaurang
    Valiveti, Sharada
    2016 5TH INTERNATIONAL CONFERENCE ON RECENT TRENDS IN INFORMATION TECHNOLOGY (ICRTIT), 2016,
  • [35] Sparse probabilistic K-means
    Jung, Yoon Mo
    Whang, Joyce Jiyoung
    Yun, Sangwoon
    APPLIED MATHEMATICS AND COMPUTATION, 2020, 382
  • [36] Adaptive Graph K-Means
    Pei, Shenfei
    Sun, Yuanchen
    Nie, Feiping
    Jiang, Xudong
    Zheng, Zengwei
    PATTERN RECOGNITION, 2025, 161
  • [37] Balanced K-Means for Clustering
    Malinen, Mikko I.
    Franti, Pasi
    STRUCTURAL, SYNTACTIC, AND STATISTICAL PATTERN RECOGNITION, 2014, 8621 : 32 - 41
  • [38] Robust trimmed k-means
    Dorabiala, Olga
    Kutz, J. Nathan
    Aravkin, Aleksandr Y.
    PATTERN RECOGNITION LETTERS, 2022, 161 : 9 - 16
  • [39] Transformed K-means Clustering
    Goel, Anurag
    Majumdar, Angshul
    29TH EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO 2021), 2021, : 1526 - 1530
  • [40] K*-Means: An Effective and Efficient K-means Clustering Algorithm
    Qi, Jianpeng
    Yu, Yanwei
    Wang, Lihong
    Liu, Jinglei
    PROCEEDINGS OF 2016 IEEE INTERNATIONAL CONFERENCES ON BIG DATA AND CLOUD COMPUTING (BDCLOUD 2016) SOCIAL COMPUTING AND NETWORKING (SOCIALCOM 2016) SUSTAINABLE COMPUTING AND COMMUNICATIONS (SUSTAINCOM 2016) (BDCLOUD-SOCIALCOM-SUSTAINCOM 2016), 2016, : 242 - 249