K-means and Gaussian mixture modeling with a separation constraint

Cited by: 0
Authors
Jiang, He [1]
Arias-Castro, Ery [2]
Affiliations
[1] Calif State Polytech Univ Pomona, Dept Math & Stat, Pomona, CA 91768 USA
[2] Univ Calif San Diego, Dept Math, La Jolla, CA USA
Funding
National Science Foundation (USA)
Keywords
Clustering; Dynamic programming; Gaussian mixture models; K-means; Separation constraint; MAXIMUM-LIKELIHOOD-ESTIMATION; RESTRICTED EM ALGORITHM;
DOI
10.1080/03610918.2024.2354747
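Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification code
020208; 070103; 0714
Abstract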
We consider the problem of clustering with K-means and Gaussian mixture models with a constraint on the separation between the centers in the context of real-valued data. We first propose a dynamic programming approach to solving the K-means problem with a separation constraint on the centers, building on Wang and Song (2011). In the context of fitting a Gaussian mixture model, we then propose an EM algorithm that incorporates such a constraint. A separation constraint can help regularize the output of a clustering algorithm, and we provide both simulated and real data examples to illustrate this point.
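As background for the dynamic programming approach mentioned above, here is a minimal Python sketch of exact one-dimensional K-means by dynamic programming, in the spirit of Wang and Song (2011). It does not implement the separation constraint on the centers, and it is not the authors' algorithm. The function name `kmeans_1d_dp`, its signature, and the simple O(k n^2) recursion are assumptions chosen for illustration.

```python
from typing import List, Tuple


def kmeans_1d_dp(values: List[float], k: int) -> Tuple[float, List[List[float]]]:
    """Exact 1-D k-means (no separation constraint) via dynamic programming.

    Hypothetical helper for illustration; runs in O(k * n^2) time.
    Returns the optimal within-cluster sum of squares and the clusters.
    """
    xs = sorted(values)
    n = len(xs)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")

    # Prefix sums let us compute the cost of any contiguous block in O(1).
    s1 = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, x in enumerate(xs):
        s1[i + 1] = s1[i] + x
        s2[i + 1] = s2[i] + x * x

    def sse(i: int, j: int) -> float:
        """Sum of squared deviations from the mean for xs[i:j], with j > i."""
        total = s1[j] - s1[i]
        return max(0.0, (s2[j] - s2[i]) - total * total / (j - i))

    inf = float("inf")
    # cost[m][j]: best cost of splitting the first j sorted points into m clusters.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    # cut[m][j]: start index of the last cluster in that best split.
    cut = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):  # candidate last cluster is xs[i:j]
                c = cost[m - 1][i] + sse(i, j)
                if c < cost[m][j]:
                    cost[m][j] = c
                    cut[m][j] = i

    # Backtrack from the full data set to recover the clusters.
    clusters: List[List[float]] = []
    j = n
    for m in range(k, 0, -1):
        i = cut[m][j]
        clusters.append(xs[i:j])
        j = i
    clusters.reverse()
    return cost[k][n], clusters
```

The recursion relies on the fact that, in one dimension, optimal K-means clusters are contiguous blocks of the sorted data. Calling `kmeans_1d_dp([1.0, 1.1, 0.9, 10.0, 10.2, 9.8], 2)` splits the data into the three small values and the three large values.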
Pages: 15
Related papers
47 in total
[41] Tan M, 2007, STAT SINICA, V17, P945
[42] Tian, Guo-Liang; Ng, Kai Wang; Tan, Ming. EM-type algorithms for computing restricted MLEs in multivariate normal distributions and multivariate t-distributions. Computational Statistics & Data Analysis, 2008, 52(10): 4768-4778
[43] Wagstaff K., 2001, ICML, V1, P577
[44] Wang, Haizhou; Song, Mingzhou. Ckmeans.1d.dp: Optimal k-means Clustering in One Dimension by Dynamic Programming. R Journal, 2011, 3(02): 29-33
[45] Wang, Sijian; Zhu, Ji. Variable selection for model-based high-dimensional clustering and its application to microarray data. Biometrics, 2008, 64(02): 440-448
[46] Zheng, Shurong; Guo, Jianhua; Shi, Ning-Zhong; Tian, Guo-Liang. Likelihood-based approaches for multivariate linear models under inequality constraints for incomplete data. Journal of Statistical Planning and Inference, 2012, 142(11): 2926-2942
[47] Zheng, SR; Shi, NZ; Guo, JH. The restricted EM algorithm under linear inequalities in a linear model with missing data. Science in China Series A-Mathematics, 2005, 48(06): 819-828