Sparse probabilistic K-means

Cited by: 7
Authors
Jung, Yoon Mo [1]
Whang, Joyce Jiyoung [2]
Yun, Sangwoon [3]
Affiliations
[1] Sungkyunkwan Univ, Dept Math, Suwon 16419, South Korea
[2] Sungkyunkwan Univ, Dept Comp Sci & Engn, Suwon 16419, South Korea
[3] Sungkyunkwan Univ, Dept Math Educ, Seoul 03063, South Korea
Funding
National Research Foundation, Singapore;
Keywords
Clustering; K-means; Alternating minimization; SELECTION;
DOI
10.1016/j.amc.2020.125328
Chinese Library Classification
O29 [Applied Mathematics];
Subject classification code
070104;
Abstract
The goal of clustering is to partition a set of data points into groups of similar data points, called clusters. Clustering algorithms can be classified into two categories: hard and soft clustering. Hard clustering assigns each data point to one cluster exclusively. On the other hand, soft clustering allows probabilistic assignments to clusters. In this paper, we propose a new model which combines the benefits of these two models: clarity of hard clustering and probabilistic assignments of soft clustering. Since the majority of data usually have a clear association, only a few points may require a probabilistic interpretation. Thus, we apply the ℓ1 norm constraint to impose sparsity on probabilistic assignments. Moreover, we also incorporate outlier detection in our clustering model to simultaneously detect outliers which can cause serious problems in statistical analyses. To optimize the model, we introduce an alternating minimization method and prove its convergence. Numerical experiments and comparisons with existing models show the soundness and effectiveness of the proposed model. © 2020 Elsevier Inc. All rights reserved.
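The contrast the abstract draws between hard, soft, and sparse probabilistic assignments can be illustrated with a minimal Python sketch. This is not the paper's model: the softmax weighting, the `temperature` parameter, and the thresholding step in `sparsify` are assumptions introduced here as a simple stand-in for the ℓ1-constrained formulation, and all function names are hypothetical.

```python
import math


def sq_dist(point, center):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(point, center))


def hard_assign(point, centers):
    """Hard clustering: index of the single nearest center."""
    dists = [sq_dist(point, c) for c in centers]
    return dists.index(min(dists))


def soft_assign(point, centers, temperature=1.0):
    """Soft clustering (assumed softmax form): one probability per center."""
    dists = [sq_dist(point, c) for c in centers]
    shift = min(dists)  # subtract the minimum for numerical stability
    weights = [math.exp(-(d - shift) / temperature) for d in dists]
    total = sum(weights)
    return [w / total for w in weights]


def sparsify(probs, threshold=0.05):
    """Illustrative stand-in for sparsity: zero small entries, renormalize."""
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    if total == 0.0:
        # Threshold removed everything: fall back to a hard assignment.
        best = probs.index(max(probs))
        return [1.0 if i == best else 0.0 for i in range(len(probs))]
    return [p / total for p in kept]


centers = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
clear_point = (0.1, 0.0)      # close to the first center only
ambiguous_point = (2.0, 0.0)  # equidistant from the first two centers

print(hard_assign(clear_point, centers))
print(sparsify(soft_assign(clear_point, centers)))
print(sparsify(soft_assign(ambiguous_point, centers)))
```

In this sketch a point with a clear association ends up with a one-hot assignment, while a point lying between two centers keeps a probabilistic assignment over only those two, which is the behavior the abstract motivates.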
Pages: 12
Related papers
50 in total
  • [31] Global k-means++: an effective relaxation of the global k-means clustering algorithm
    Vardakas, Georgios
    Likas, Aristidis
    [J]. APPLIED INTELLIGENCE, 2024, 54 (19) : 8876 - 8888
  • [32] Three-way k-means: integrating k-means and three-way decision
    Wang, Pingxin
    Shi, Hong
    Yang, Xibei
    Mi, Jusheng
    [J]. INTERNATIONAL JOURNAL OF MACHINE LEARNING AND CYBERNETICS, 2019, 10 (10) : 2767 - 2777
  • [33] Empirical Evaluation of K-Means, Bisecting K-Means, Fuzzy C-Means and Genetic K-Means Clustering Algorithms
    Banerjee, Shreya
    Choudhary, Ankit
    Pal, Somnath
    [J]. 2015 IEEE INTERNATIONAL WIE CONFERENCE ON ELECTRICAL AND COMPUTER ENGINEERING (WIECON-ECE), 2015, : 172 - 176
  • [34] A k-means based co-clustering (kCC) algorithm for sparse, high dimensional data
    Hussain, Syed Fawad
    Haris, Muhammad
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2019, 118 : 20 - 34
  • [35] The k-means Algorithm: A Comprehensive Survey and Performance Evaluation
    Ahmed, Mohiuddin
    Seraj, Raihan
    Islam, Syed Mohammed Shamsul
    [J]. ELECTRONICS, 2020, 9 (08) : 1 - 12
  • [36] t-k-means: A ROBUST AND STABLE k-means VARIANT
    Li, Yiming
    Zhang, Yang
    Tang, Qingtao
    Huang, Weipeng
    Jiang, Yong
    Xia, Shu-Tao
    [J]. 2021 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP 2021), 2021, : 3120 - 3124
  • [37] Clustering of Image Data Using K-Means and Fuzzy K-Means
    Rahmani, Md. Khalid Imam
    Pal, Naina
    Arora, Kamiya
    [J]. INTERNATIONAL JOURNAL OF ADVANCED COMPUTER SCIENCE AND APPLICATIONS, 2014, 5 (07) : 160 - 163
  • [38] Improving Clustering Method Performance Using K-Means, Mini Batch K-Means, BIRCH and Spectral
    Wahyuningrum, Tenia
    Khomsah, Siti
    Suyanto, Suyanto
    Meliana, Selly
    Yunanto, Prasti Eko
    Al Maki, Wikky F.
    [J]. 2021 4TH INTERNATIONAL SEMINAR ON RESEARCH OF INFORMATION TECHNOLOGY AND INTELLIGENT SYSTEMS (ISRITI 2021), 2020,
  • [39] PSO Aided k-Means Clustering: Introducing Connectivity in k-Means
    Breaban, Mihaela Elena
    Luchian, Henri
    [J]. GECCO-2011: PROCEEDINGS OF THE 13TH ANNUAL GENETIC AND EVOLUTIONARY COMPUTATION CONFERENCE, 2011, : 1227 - 1234
  • [40] Density K-means: A New Algorithm for Centers Initialization for K-means
    Lan, Xv
    Li, Qian
    Zheng, Yi
    [J]. PROCEEDINGS OF 2015 6TH IEEE INTERNATIONAL CONFERENCE ON SOFTWARE ENGINEERING AND SERVICE SCIENCE, 2015, : 958 - 961