PCA-guided search for K-means

被引:42
作者
Xu, Qin [1 ]
Ding, Chris [2 ]
Liu, Jinpei [3 ]
Luo, Bin [1 ]
机构
[1] Anhui Univ, Sch Comp Sci & Technol, Hefei 230601, Anhui, Peoples R China
[2] Univ Texas Arlington, Dept Comp Sci & Engn, Arlington, TX 76019 USA
[3] Anhui Univ, Sch Business, Hefei 730601, Anhui, Peoples R China
基金
中国国家自然科学基金;
关键词
K-means; Principal component analysis; Cluster centroid initialization; Clustering; ALGORITHM;
D O I
10.1016/j.patrec.2014.11.017
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
K-means is undoubtedly the most widely used partitional clustering algorithm. Unfortunately, due to the non-convexity of the model formulations, expectation-maximization (EM) type algorithms converge to different local optima with different initializations. Recent discoveries have identified that the global solution of K-means cluster centroids lies in the principal component analysis (PCA) subspace. Based on this insight, we propose PCA-guided effective search for K-means. Because the PCA subspace is much smaller than the original space, searching in the PCA subspace is both more effective and efficient. Extensive experiments on four real world data sets and systematic comparison with previous algorithms demonstrate that our proposed method outperforms the rest as it makes the K-means more effective. (C) 2015 Elsevier B.V. All rights reserved.
引用
收藏
页码:50 / 55
页数:6
相关论文
共 50 条
  • [21] k-Means-MIND: An Efficient Alternative to Repetitive k-Means Runs
    Olukanmi, Peter O.
    Nelwamondo, Fuluffielo
    Marwala, Tshilidzi
    2020 7TH INTERNATIONAL CONFERENCE ON SOFT COMPUTING & MACHINE INTELLIGENCE (ISCMI 2020), 2020, : 172 - 176
  • [22] Soil data clustering by using K-means and fuzzy K-means algorithm
    Hot, Elma
    Popovic-Bugarin, Vesna
    2015 23RD TELECOMMUNICATIONS FORUM TELFOR (TELFOR), 2015, : 890 - 893
  • [23] Improving Intrusion Detection Using PCA And K-Means Clustering Algorithm
    Khaoula, Radi
    Mohamed, Moughit
    2022 9TH INTERNATIONAL CONFERENCE ON WIRELESS NETWORKS AND MOBILE COMMUNICATIONS, WINCOM, 2022, : 19 - 23
  • [24] A Hybrid Cuckoo Search and K-Means for Clustering Problem
    Girsang, Abba Suganda
    Yunanto, Ardian
    Aslamiah, Ayu Hidayah
    2017 INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING AND COMPUTER SCIENCE (ICECOS), 2017, : 120 - 124
  • [26] Cluster quality analysis based on SVD, PCA-based k-means and NMF techniques: an online survey data
    Mohanty H.
    Champati S.
    Barik B.L.P.
    Panda A.
    International Journal of Reasoning-based Intelligent Systems, 2023, 15 (01) : 86 - 96
  • [27] An Epileptic Signal Preictal Ictal Using PCA, K-MEANS and K Nearest Neighbors
    Noertjahjani, Siswandari
    Susanto, Adhi
    Hidayat, Risanuri
    Wibowo, Samekto
    2015 2ND INTERNATIONAL CONFERENCE ON INFORMATION TECHNOLOGY, COMPUTER, AND ELECTRICAL ENGINEERING (ICITACEE), 2015, : 202 - 206
  • [28] Seeding on Samples for Accelerating K-Means Clustering
    Low, Jia Shun
    Ghafoori, Zahra
    Bezdek, James C.
    Leckie, Christopher
    3RD INTERNATIONAL CONFERENCE ON BIG DATA AND INTERNET OF THINGS (BDIOT 2019), 2018, : 41 - 45
  • [29] Automatic estimation of clusters number for K-means
    Sabri, My Abdelouahed
    Ennouni, Assia
    Aarab, Abdellah
    2016 4TH IEEE INTERNATIONAL COLLOQUIUM ON INFORMATION SCIENCE AND TECHNOLOGY (CIST), 2016, : 450 - 454
  • [30] k-Means has Polynomial Smoothed Complexity
    Arthur, David
    Manthey, Bodo
    Roeglin, Heiko
    2009 50TH ANNUAL IEEE SYMPOSIUM ON FOUNDATIONS OF COMPUTER SCIENCE: FOCS 2009, PROCEEDINGS, 2009, : 405 - 414