Sparse probabilistic K-means

Citations: 7
Authors
Jung, Yoon Mo [1]
Whang, Joyce Jiyoung [2]
Yun, Sangwoon [3]
Affiliations
[1] Sungkyunkwan Univ, Dept Math, Suwon 16419, South Korea
[2] Sungkyunkwan Univ, Dept Comp Sci & Engn, Suwon 16419, South Korea
[3] Sungkyunkwan Univ, Dept Math Educ, Seoul 03063, South Korea
Funding
National Research Foundation, Singapore
Keywords
Clustering; K-means; Alternating minimization; SELECTION;
DOI
10.1016/j.amc.2020.125328
Chinese Library Classification number
O29 [Applied Mathematics]
Discipline classification code
070104
Abstract
The goal of clustering is to partition a set of data points into groups of similar data points, called clusters. Clustering algorithms can be classified into two categories: hard and soft clustering. Hard clustering assigns each data point to one cluster exclusively. On the other hand, soft clustering allows probabilistic assignments to clusters. In this paper, we propose a new model which combines the benefits of these two models: clarity of hard clustering and probabilistic assignments of soft clustering. Since the majority of data usually have a clear association, only a few points may require a probabilistic interpretation. Thus, we apply the ℓ1 norm constraint to impose sparsity on probabilistic assignments. Moreover, we also incorporate outlier detection in our clustering model to simultaneously detect outliers which can cause serious problems in statistical analyses. To optimize the model, we introduce an alternating minimization method and prove its convergence. Numerical experiments and comparisons with existing models show the soundness and effectiveness of the proposed model. © 2020 Elsevier Inc. All rights reserved.
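The abstract does not state the objective function, so the following is only an illustrative sketch of the general idea (sparse probabilistic assignments, outlier flagging, alternating minimization) and not the authors' model. As a stand-in for the paper's ℓ1-constrained formulation, it assumes a quadratically regularized assignment step whose solution is a Euclidean projection onto the probability simplex, which yields exact zeros, and it flags outliers by trimming the highest-cost points. The names `project_simplex`, `sparse_soft_kmeans`, `gamma`, `n_outliers`, and `init` are hypothetical.

```python
import numpy as np

def project_simplex(V):
    """Row-wise Euclidean projection onto the probability simplex."""
    n, k = V.shape
    S = -np.sort(-V, axis=1)                 # each row sorted in descending order
    css = np.cumsum(S, axis=1) - 1.0
    cond = S - css / np.arange(1, k + 1) > 0
    rho = np.count_nonzero(cond, axis=1)     # size of the support of each row
    tau = css[np.arange(n), rho - 1] / rho   # per-row shift
    return np.maximum(V - tau[:, None], 0.0)

def sparse_soft_kmeans(X, k, gamma=1.0, n_outliers=0, n_iter=50, init=None, seed=0):
    """Alternate between sparse soft assignments U and weighted centroids C.

    Assumed surrogate objective (not taken from the paper):
        sum_i sum_j U[i, j] * ||x_i - c_j||^2 + (gamma / 2) * ||U[i]||^2,
    with each row of U on the probability simplex.
    """
    X = np.asarray(X, dtype=float)
    if init is None:
        rng = np.random.default_rng(seed)
        C = X[rng.choice(len(X), size=k, replace=False)].copy()
    else:
        C = np.array(init, dtype=float)
    for _ in range(n_iter):
        # Squared distances from every point to every centroid, shape (n, k).
        D = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2)
        # Assignment step: closed-form minimizer is a simplex projection.
        U = project_simplex(-D / gamma)
        # Outlier step: trim the n_outliers points with the largest cost.
        cost = (U * D).sum(axis=1) + 0.5 * gamma * (U ** 2).sum(axis=1)
        inlier = np.ones(len(X), dtype=bool)
        if n_outliers > 0:
            inlier[np.argsort(cost)[-n_outliers:]] = False
        # Centroid step: weighted means over inliers only.
        W = U * inlier[:, None]
        mass = W.sum(axis=0)
        nonempty = mass > 0
        C[nonempty] = (W.T @ X)[nonempty] / mass[nonempty, None]
    # U and inlier are those computed just before the final centroid update.
    return C, U, inlier
```

In this sketch a larger `gamma` spreads probability mass over more clusters, while a small `gamma` makes almost every row of `U` one-hot, so most points keep a hard label and only ambiguous points receive a genuinely probabilistic assignment.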
Pages: 12