Extending mixtures of multivariate t-factor analyzers

Cited by: 76
Authors
Andrews, Jeffrey L. [1]
McNicholas, Paul D. [1]
Affiliations
[1] Univ Guelph, Dept Math & Stat, Guelph, ON N1G 2W1, Canada
Funding
Canada Foundation for Innovation; Natural Sciences and Engineering Research Council of Canada
Keywords
Factor analysis; Latent variables; Mixture models; Model-based clustering; Multivariate t-distributions; t-Factor analyzers; MAXIMUM-LIKELIHOOD; MODEL; EM; ALGORITHM;
DOI
10.1007/s11222-010-9175-2
Chinese Library Classification number
TP301 [Theory, methods]
Subject classification code
081202
Abstract
Model-based clustering typically involves the development of a family of mixture models and the imposition of these models upon data. The best member of the family is then chosen using some criterion and the associated parameter estimates lead to predicted group memberships, or clusterings. This paper describes the extension of the mixtures of multivariate t-factor analyzers model to include constraints on the degrees of freedom, the factor loadings, and the error variance matrices. The result is a family of six mixture models, including parsimonious models. Parameter estimates for this family of models are derived using an alternating expectation-conditional maximization algorithm and convergence is determined based on Aitken's acceleration. Model selection is carried out using the Bayesian information criterion (BIC) and the integrated completed likelihood (ICL). This novel family of mixture models is then applied to simulated and real data where clustering performance meets or exceeds that of established model-based clustering methods. The simulation studies include a comparison of the BIC and the ICL as model selection techniques for this novel family of models. Application to simulated data with larger dimensionality is also explored.
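The convergence and model selection steps named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the tolerance, the exact form of the Aitken stopping rule, and the "larger is better" sign convention for the BIC are all assumptions, and the ICL is written in the commonly used approximate form that penalizes the BIC by the uncertainty of the MAP classifications.

```python
import math

def aitken_converged(loglik, tol=1e-2):
    """Aitken-based stopping rule on a list of successive log-likelihoods.

    Assumed form: stop when the asymptotic estimate of the log-likelihood
    is within `tol` of the latest value. `tol` is an illustrative default.
    """
    if len(loglik) < 3:
        return False  # need three values to estimate the acceleration
    l_prev, l_curr, l_next = loglik[-3], loglik[-2], loglik[-1]
    denom = l_curr - l_prev
    if denom == 0.0:
        return True  # log-likelihood has stopped changing
    a = (l_next - l_curr) / denom  # Aitken acceleration
    if a >= 1.0:
        return False  # increments are not shrinking yet
    l_inf = l_curr + (l_next - l_curr) / (1.0 - a)  # asymptotic estimate
    return abs(l_inf - l_next) < tol

def bic(loglik, n_params, n_obs):
    """BIC in the 'larger is better' convention (an assumption here)."""
    return 2.0 * loglik - n_params * math.log(n_obs)

def icl_approx(bic_value, z):
    """Approximate ICL: BIC plus twice the log of each observation's
    largest posterior membership probability (rows of z sum to 1)."""
    return bic_value + 2.0 * sum(math.log(max(row)) for row in z)
```

Under these conventions, each candidate model in the family would be fitted until `aitken_converged` returns true, and the model with the largest `bic` or `icl_approx` value would be selected. With hard (0/1) memberships the penalty term vanishes and the two criteria coincide.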
Pages: 361-373
Number of pages: 13