Sparse probabilistic K-means

Cited by: 7

Authors:

Jung, Yoon Mo [1]

Whang, Joyce Jiyoung [2]

Yun, Sangwoon [3]

Affiliations:

[1] Sungkyunkwan Univ, Dept Math, Suwon 16419, South Korea

[2] Sungkyunkwan Univ, Dept Comp Sci & Engn, Suwon 16419, South Korea

[3] Sungkyunkwan Univ, Dept Math Educ, Seoul 03063, South Korea

Source:

APPLIED MATHEMATICS AND COMPUTATION | 2020 / Volume 382

Funding:

National Research Foundation, Singapore;

Keywords:

Clustering; K-means; Alternating minimization; SELECTION;

DOI:

10.1016/j.amc.2020.125328

Chinese Library Classification (CLC) number:

O29 [Applied Mathematics];

Discipline classification code:

070104;

Abstract:

The goal of clustering is to partition a set of data points into groups of similar data points, called clusters. Clustering algorithms can be classified into two categories: hard and soft clustering. Hard clustering assigns each data point to one cluster exclusively. On the other hand, soft clustering allows probabilistic assignments to clusters. In this paper, we propose a new model which combines the benefits of these two models: clarity of hard clustering and probabilistic assignments of soft clustering. Since the majority of data usually have a clear association, only a few points may require a probabilistic interpretation. Thus, we apply the ℓ1 norm constraint to impose sparsity on probabilistic assignments. Moreover, we also incorporate outlier detection in our clustering model to simultaneously detect outliers which can cause serious problems in statistical analyses. To optimize the model, we introduce an alternating minimization method and prove its convergence. Numerical experiments and comparisons with existing models show the soundness and effectiveness of the proposed model. © 2020 Elsevier Inc. All rights reserved.
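
The contrast the abstract draws between hard, soft, and sparse probabilistic assignments can be illustrated with a small sketch. This is not the paper's model: the abstract does not state the objective, the form of the ℓ1 constraint, the outlier term, or the alternating minimization steps, so none of those are reproduced here. As a stand-in, the sketch assumes fixed cluster centers and obtains sparse memberships by Euclidean projection of negative squared distances onto the probability simplex, a different and well-known way to get probability vectors with exact zeros. All function names and the `temperature` parameter are hypothetical.

```python
import math


def sq_dist(p, c):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, c))


def hard_assign(point, centers):
    """Hard clustering: all membership mass on the nearest center."""
    d = [sq_dist(point, c) for c in centers]
    k = d.index(min(d))
    return [1.0 if j == k else 0.0 for j in range(len(centers))]


def soft_assign(point, centers, temperature=1.0):
    """Soft clustering: dense memberships via a softmax of negative distances."""
    scores = [-sq_dist(point, c) / temperature for c in centers]
    m = max(scores)  # subtract the max for numerical stability
    w = [math.exp(s - m) for s in scores]
    total = sum(w)
    return [x / total for x in w]


def project_to_simplex(v):
    """Euclidean projection of v onto {p : p >= 0, sum(p) = 1} (sort-based)."""
    u = sorted(v, reverse=True)
    cumsum = 0.0
    theta = 0.0
    for i, ui in enumerate(u, start=1):
        cumsum += ui
        t = (cumsum - 1.0) / i
        if ui - t > 0:  # holds for a prefix of the sorted entries
            theta = t
    return [max(x - theta, 0.0) for x in v]


def sparse_assign(point, centers, temperature=1.0):
    """Illustrative sparse memberships (assumption: simplex projection,
    not the paper's l1-constrained formulation)."""
    scores = [-sq_dist(point, c) / temperature for c in centers]
    return project_to_simplex(scores)


centers = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
clear_point = (0.1, 0.0)      # close to the first center
boundary_point = (2.0, 0.1)   # midway between the first two centers

for name, p in [("clear", clear_point), ("boundary", boundary_point)]:
    print(name, hard_assign(p, centers), soft_assign(p, centers),
          sparse_assign(p, centers))
```

In this toy setup the point near a single center receives a one-hot sparse membership, as in hard clustering, while the boundary point keeps nonzero probability only for the two centers it lies between. The softmax version, by contrast, gives every cluster a strictly positive weight.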


Pages: 12

