A NON ASYMPTOTIC PENALIZED CRITERION FOR GAUSSIAN MIXTURE MODEL SELECTION

Cited by: 31
Authors
Maugis, Cathy [1]
Michel, Bertrand [2]
Affiliations
[1] Univ Toulouse, Inst Math Toulouse, INSA Toulouse, F-31077 Toulouse 4, France
[2] Univ Paris 06, Lab Stat Theor & Appl, F-75013 Paris, France
Keywords
Model-based clustering; variable selection; penalized likelihood criterion; bracketing entropy; MAXIMUM-LIKELIHOOD; CONVERGENCE; RATES;
DOI
10.1051/ps/2009004
Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline classification code
020208; 070103; 0714;
Abstract
Specific Gaussian mixtures are considered in order to solve variable selection and clustering problems simultaneously. A non-asymptotic penalized criterion is proposed to choose the number of mixture components and the relevant variable subset. Because the associated Kullback-Leibler contrast is nonlinear on Gaussian mixtures, a general model selection theorem for maximum likelihood estimation proposed by Massart [Concentration inequalities and model selection, Springer, Berlin (2007); lectures from the 33rd Summer School on Probability Theory held in Saint-Flour, July 6-23, 2003] is used to obtain the form of the penalty function. This theorem requires controlling the bracketing entropy of Gaussian mixture families. Both the ordered and non-ordered variable selection cases are addressed in this paper.
Pages: 41-68
Number of pages: 28
Related papers
32 in total
[1]  
Akaike H., 1973, 2 INTERNAT SYMPOS IN, P267, DOI [10.1007/978-1-4612-1694-0_15, 10.1007/978-1-4612-1694-0, 10.1007/978-1-4612-0919-5_38]
[2]  
[Anonymous], 2000, Sankhya Ser. A, DOI 10.2307/25051289
[3]  
[Anonymous], 2002, Model selection and multimodel inference: a practical information-theoretic approach
[4]  
[Anonymous], THESIS U PARIS SUD 1
[5]  
[Anonymous], 1997, Festschrift for Lucien Le Cam
[6]  
Arlot S., 2008, J MACH LEAR IN PRESS
[7]   MODEL-BASED GAUSSIAN AND NON-GAUSSIAN CLUSTERING [J].
BANFIELD, JD ;
RAFTERY, AE .
BIOMETRICS, 1993, 49 (03) :803-821
[8]   Risk bounds for model selection via penalization [J].
Barron, A ;
Birgé, L ;
Massart, P .
PROBABILITY THEORY AND RELATED FIELDS, 1999, 113 (03) :301-413
[9]   Assessing a mixture model for clustering with the integrated completed likelihood [J].
Biernacki, C ;
Celeux, G ;
Govaert, G .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2000, 22 (07) :719-725
[10]   Model-based cluster and discriminant analysis with the MIXMOD software [J].
Biernacki, Christophe ;
Celeux, Gilles ;
Govaert, Gerard ;
Langrognet, Florent .
COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2006, 51 (02) :587-600