Subspace K-means clustering

Cited by: 0
Authors
Marieke E. Timmerman
Eva Ceulemans
Kim De Roover
Karla Van Leeuwen
Affiliations
[1] University of Groningen,Heymans Institute for Psychology, Psychometrics & Statistics
[2] K.U. Leuven,Educational Sciences
[3] K.U. Leuven,Parenting and Special Education
Source
Behavior Research Methods | 2013, Vol. 45
Keywords
Cluster analysis; Cluster recovery; Multivariate data; Reduced K-means; Factorial K-means; Mixtures of factor analyzers; MCLUST
DOI: Not available
Abstract
To achieve an insightful clustering of multivariate data, we propose subspace K-means. Its central idea is to model the centroids and cluster residuals in reduced spaces, which allows for dealing with a wide range of cluster types and yields rich interpretations of the clusters. We review the existing related clustering methods, including deterministic, stochastic, and unsupervised learning approaches. To evaluate subspace K-means, we performed a comparative simulation study, in which we manipulated the overlap of subspaces, the between-cluster variance, and the error variance. The study shows that the subspace K-means algorithm is sensitive to local minima but that the problem can be reasonably dealt with by using partitions of various cluster procedures as a starting point for the algorithm. Subspace K-means performs very well in recovering the true clustering across all conditions considered and appears to be superior to its competitor methods: K-means, reduced K-means, factorial K-means, mixtures of factor analyzers (MFA), and MCLUST. The best competitor method, MFA, showed a performance similar to that of subspace K-means in easy conditions but deteriorated in more difficult ones. Using data from a study on parental behavior, we show that subspace K-means analysis provides a rich insight into the cluster characteristics, in terms of both the relative positions of the clusters (via the centroids) and the shape of the clusters (via the within-cluster residuals).
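The central idea described in the abstract, modeling the cluster centroids in a reduced space while assigning points as in ordinary K-means, can be illustrated with a toy sketch. This is not the authors' exact algorithm (which also models within-cluster residuals in reduced spaces); it is a minimal reduced-space variant, and the function name `subspace_kmeans_sketch` and its initialization are hypothetical choices for the demo.

```python
import numpy as np

def subspace_kmeans_sketch(X, k, r, init, n_iter=50):
    """Toy sketch: K-means whose centroids are constrained to an
    r-dimensional subspace (in the spirit of reduced K-means).
    Illustrative only -- not the published subspace K-means algorithm."""
    centroids = X[init].astype(float).copy()
    for _ in range(n_iter):
        # Assignment step: nearest centroid in the full variable space.
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Update step: recompute centroids, then project the centered
        # centroid matrix onto its top-r principal directions.
        for j in range(k):
            members = X[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
        mu = centroids.mean(axis=0)
        _, _, Vt = np.linalg.svd(centroids - mu, full_matrices=False)
        centroids = mu + (centroids - mu) @ Vt[:r].T @ Vt[:r]
    return labels, centroids

# Two well-separated blobs in 5 dimensions; centroids restricted to r=1.
rng = np.random.default_rng(1)
A = rng.normal(0.0, 0.1, (30, 5))
B = rng.normal(0.0, 0.1, (30, 5)) + 3.0
X = np.vstack([A, B])
# For the demo, seed one centroid in each blob (index 0 and index -1).
labels, centroids = subspace_kmeans_sketch(X, k=2, r=1, init=[0, -1])
```

By construction, the displacement of each centroid from the grand centroid lies in an r-dimensional subspace, which is what makes the cluster positions interpretable in a low-dimensional map.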
Pages: 1011–1023
Page count: 12
Related Papers
50 records in total
  • [31] Research and Improvement on K-Means Clustering Algorithm
    Wang, Xue-mei
    Wang, Jin-bo
    PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION APPLICATIONS (ICCIA 2012), 2012, : 1138 - 1141
  • [32] Online K-Means Clustering with Lightweight Coresets
    Low, Jia Shun
    Ghafoori, Zahra
    Leckie, Christopher
    AI 2019: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, 11919 : 191 - 202
  • [33] On the Discrepancy between Kleinberg’s Clustering Axioms and k-Means Clustering Algorithm Behavior
    Mieczysław Alojzy Kłopotek
    Robert Albert Kłopotek
    Machine Learning, 2023, 112 : 2501 - 2553
  • [34] Effect of cluster size distribution on clustering: a comparative study of k-means and fuzzy c-means clustering
    Kaile Zhou
    Shanlin Yang
    Pattern Analysis and Applications, 2020, 23 : 455 - 466
  • [35] Fast and exact out-of-core and distributed k-means clustering
    Ruoming Jin
    Anjan Goswami
    Gagan Agrawal
    Knowledge and Information Systems, 2006, 10 : 17 - 40
  • [36] Rainfall flow optimization based K-Means clustering for medical data
    Jaya Mabel Rani, Antony
    Pravin, Albert
    CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE, 2021, 33 (17)
  • [37] A robust fuzzy k-means clustering model for interval valued data
    Pierpaolo D’Urso
    Paolo Giordani
    Computational Statistics, 2006, 21 : 251 - 269
  • [38] Private Hospital Workflow Optimization via Secure k-Means Clustering
    Gabriele Spini
    Maran van Heesch
    Thijs Veugen
    Supriyo Chatterjea
    Journal of Medical Systems, 2020, 44
  • [39] Band depth based initialization of K-means for functional data clustering
    Javier Albert-Smet
    Aurora Torrente
    Juan Romo
    Advances in Data Analysis and Classification, 2023, 17 : 463 - 484
  • [40] To Discriminate General Election system in Thailand by using K-Means Clustering
    Phoonokniam, Siriya
    Kanchanasuntorn, Kanchana
    Vongmanee, Varin
    JOURNAL OF PHARMACEUTICAL NEGATIVE RESULTS, 2022, 13 : 771 - 782