Model-based clustering with determinant-and-shape constraint

Cited by: 8
Authors
Garcia-Escudero, Luis Angel [1, 2]
Mayo-Iscar, Agustin [1, 2]
Riani, Marco [3, 4]
Affiliations
[1] Univ Valladolid, Dept Stat & Operat Res, Valladolid, Spain
[2] Univ Valladolid, IMUVA, Valladolid, Spain
[3] Univ Parma, Dept Econ & Management, Parma, Italy
[4] Univ Parma, Interdept Ctr Robust Stat, Parma, Italy
Keywords
Clustering; Constraints; Mixture modeling; Robustness; MAXIMUM-LIKELIHOOD-ESTIMATION; PARSIMONIOUS MIXTURES; FINITE MIXTURE; ROBUST; EM; ALGORITHM; NUMBER;
DOI
10.1007/s11222-020-09950-w
Chinese Library Classification number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Model-based approaches to cluster analysis and mixture modeling often involve maximizing classification and mixture likelihoods. Without appropriate constraints on the scatter matrices of the components, these maximizations result in ill-posed problems. Moreover, without constraints, non-interesting or "spurious" clusters are often detected by the EM and CEM algorithms traditionally used for the maximization of the likelihood criteria. Considering an upper bound on the maximal ratio between the determinants of the scatter matrices seems to be a sensible way to overcome these problems by affine equivariant constraints. Unfortunately, problems still arise without also controlling the elements of the "shape" matrices. A new methodology is proposed that allows control of both the determinants of the scatter matrices and the elements of the shape matrices. Some theoretical justification is given. A fast algorithm is proposed for this doubly constrained maximization. The methodology is also extended to robust model-based clustering problems.
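As an illustration of the two quantities the abstract refers to, the following minimal sketch computes the ratio between the largest and smallest determinants across components, and, within each component, the ratio between the largest and smallest eigenvalues. It assumes that each "shape" matrix is the diagonal matrix of a scatter matrix's eigenvalues rescaled to unit determinant, so the ratio of its elements equals the eigenvalue ratio; the abstract does not spell out this decomposition. The function names and the bounds `c_det` and `c_shape` are hypothetical, and this is a diagnostic check only, not the constrained maximization algorithm proposed in the paper.

```python
import numpy as np

def determinant_ratio(covs):
    """Largest determinant divided by the smallest, across components."""
    dets = [np.linalg.det(S) for S in covs]
    return max(dets) / min(dets)

def shape_ratios(covs):
    """Per component: largest eigenvalue divided by the smallest.

    Rescaling the eigenvalues to unit determinant (the assumed "shape"
    elements) multiplies all of them by the same factor, so this ratio
    is unchanged by that rescaling.
    """
    ratios = []
    for S in covs:
        eig = np.linalg.eigvalsh(S)  # eigenvalues in ascending order
        ratios.append(eig[-1] / eig[0])
    return ratios

def satisfies_constraints(covs, c_det, c_shape):
    """True if both hypothetical bounds hold for every component."""
    det_ok = determinant_ratio(covs) <= c_det
    shape_ok = max(shape_ratios(covs)) <= c_shape
    return bool(det_ok and shape_ok)

# Two components with equal determinants but very different shapes.
covs = [np.eye(2), np.diag([4.0, 0.25])]
print(determinant_ratio(covs))
print(shape_ratios(covs))
```

In this example both scatter matrices have the same determinant, so a bound on the determinant ratio alone does not restrict the second, strongly elongated component. This is the kind of situation in which the abstract notes that the shape elements need to be controlled as well.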
Pages: 1363-1380
Number of pages: 18