Extension of the mixture of factor analyzers model to incorporate the multivariate t-distribution

Cited by: 93
Authors
McLachlan, G. J. [1]
Bean, R. W.
Jones, L. Ben-Tovim
Affiliations
[1] Univ Queensland, Dept Math, Brisbane, Qld 4072, Australia
[2] Univ Queensland, Inst Mol Biosci, Brisbane, Qld 4072, Australia
Keywords
mixture modelling; factor analyzers; multivariate t-distribution; EM algorithm; maximum-likelihood estimation; algorithm
DOI
10.1016/j.csda.2006.09.015
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Mixtures of factor analyzers enable model-based density estimation to be undertaken for high-dimensional data, where the number of observations n is small relative to their dimension p. However, this approach is sensitive to outliers as it is based on a mixture model in which the multivariate normal family of distributions is assumed for the component error and factor distributions. An extension to mixtures of t-factor analyzers is considered, whereby the multivariate t-family is adopted for the component error and factor distributions. An EM-based algorithm is developed for the fitting of mixtures of t-factor analyzers. Its application is demonstrated in the clustering of some microarray gene-expression data. (C) 2006 Elsevier B.V. All rights reserved.
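
As a rough illustration of the kind of density the abstract describes, the following minimal sketch evaluates the log-density of a mixture of multivariate t components whose scale matrices have a factor-analytic form, B B^T + diag(d). The function names, the argument names (B for the p x q loading matrix, d for the diagonal error variances, nu for the degrees of freedom) and the parameterization are assumptions made for this example; the abstract does not give the paper's notation, and this is not the authors' EM fitting algorithm.

```python
import math
import numpy as np


def t_factor_logpdf(y, mu, B, d, nu):
    """Log-density of a p-variate t distribution with location mu,
    scale matrix B @ B.T + diag(d), and nu degrees of freedom.
    (Illustrative parameterization; names are hypothetical.)"""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    B = np.asarray(B, dtype=float)
    d = np.asarray(d, dtype=float)
    p = y.shape[0]
    # Factor-analytic scale matrix: low-rank part plus diagonal part.
    sigma = B @ B.T + np.diag(d)
    diff = y - mu
    # Squared Mahalanobis distance of y from mu.
    delta = float(diff @ np.linalg.solve(sigma, diff))
    _, logdet = np.linalg.slogdet(sigma)
    return (math.lgamma((nu + p) / 2.0) - math.lgamma(nu / 2.0)
            - 0.5 * p * math.log(nu * math.pi) - 0.5 * logdet
            - 0.5 * (nu + p) * math.log1p(delta / nu))


def mixture_logpdf(y, weights, mus, Bs, ds, nus):
    """Log-density of a finite mixture of the components above,
    combined with the log-sum-exp trick for numerical stability."""
    comps = [math.log(w) + t_factor_logpdf(y, m, B, d, nu)
             for w, m, B, d, nu in zip(weights, mus, Bs, ds, nus)]
    top = max(comps)
    return top + math.log(sum(math.exp(c - top) for c in comps))
```

With q much smaller than p, each scale matrix is described by the p x q loadings and p diagonal entries rather than a full p x p matrix, which is what makes this family usable when n is small relative to p. The heavier tails of the t density (small nu) are what reduce the influence of outlying observations compared with a normal component.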
Pages: 5327-5338
Number of pages: 12