CHull as an alternative to AIC and BIC in the context of mixtures of factor analyzers

Cited by: 45
Authors
Bulteel, Kirsten [1]
Wilderjans, Tom F. [1]
Tuerlinckx, Francis [1]
Ceulemans, Eva [1, 2]
Affiliations
[1] Katholieke Univ Leuven, B-3000 Louvain, Belgium
[2] Katholieke Univ Leuven, Methodol Educ Sci Res Grp, B-3000 Louvain, Belgium
Keywords
Mixture analysis; Model selection; AIC; BIC; CHull; MODEL SELECTION; NUMBER; CRITERION; COMPLEXITIES; CLUSTERS;
DOI
10.3758/s13428-012-0293-y
Chinese Library Classification
B841 [Research methods in psychology];
Discipline classification code
040201;
Abstract
Mixture analysis is commonly used for clustering objects on the basis of multivariate data. When the data contain a large number of variables, regular mixture analysis may become problematic, because a large number of parameters need to be estimated for each cluster. To tackle this problem, the mixtures-of-factor-analyzers (MFA) model was proposed, which combines clustering with exploratory factor analysis. MFA model selection is rather intricate, as both the number of clusters and the number of underlying factors have to be determined. To this end, the Akaike (AIC) and Bayesian (BIC) information criteria are often used. AIC and BIC try to identify a model that optimally balances model fit and model complexity. In this article, the CHull (Ceulemans & Kiers, 2006) method, which also balances model fit and complexity, is presented as an interesting alternative model selection strategy for MFA. In an extensive simulation study, the performances of AIC, BIC, and CHull were compared. AIC performs poorly and systematically selects overly complex models, whereas BIC performs slightly better than CHull when considering the best model only. However, when taking model selection uncertainty into account by looking at the first three models retained, CHull outperforms BIC. This especially holds in more complex, and thus more realistic, situations (e.g., more clusters, factors, noise in the data, and overlap among clusters).
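The trade-off between fit and complexity described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. AIC and BIC use their standard definitions. The CHull part follows the procedure as it is commonly described (keep the models on the upper boundary of the convex hull of the fit-versus-complexity plot, then rank them by a scree-test ratio), and it assumes that fit is the log-likelihood and complexity is the number of free parameters. All function names and the example numbers are hypothetical.

```python
import math

def aic(loglik, n_params):
    """Akaike information criterion (lower is better)."""
    return -2.0 * loglik + 2.0 * n_params

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion (lower is better)."""
    return -2.0 * loglik + n_params * math.log(n_obs)

def upper_hull(models):
    """Return the (complexity, fit) pairs on the upper convex boundary.

    `models` is an iterable of (complexity, fit) pairs, where a higher
    fit is better (assumed here: log-likelihood vs. free parameters).
    """
    # Keep only the best-fitting model for each complexity value.
    best = {}
    for c, f in models:
        if c not in best or f > best[c]:
            best[c] = f
    hull = []
    for c, f in sorted(best.items()):
        # Skip models that do not fit better than a simpler one.
        if hull and f <= hull[-1][1]:
            continue
        # Pop earlier points that fall on or below the line to the new point.
        while len(hull) >= 2:
            (c1, f1), (c2, f2) = hull[-2], hull[-1]
            if (c2 - c1) * (f - f1) - (f2 - f1) * (c - c1) >= 0:
                hull.pop()
            else:
                break
        hull.append((c, f))
    return hull

def scree_ratios(hull):
    """Rank interior hull models by scree-test ratio, largest first."""
    ratios = []
    for i in range(1, len(hull) - 1):
        (c0, f0), (c1, f1), (c2, f2) = hull[i - 1], hull[i], hull[i + 1]
        gain_before = (f1 - f0) / (c1 - c0)
        gain_after = (f2 - f1) / (c2 - c1)
        ratios.append((c1, gain_before / gain_after))
    return sorted(ratios, key=lambda r: r[1], reverse=True)

# Hypothetical (free parameters, log-likelihood) pairs for six fitted models.
candidates = [(10, -1000.0), (20, -900.0), (25, -890.0),
              (30, -600.0), (40, -580.0), (50, -570.0)]
hull = upper_hull(candidates)
ranking = scree_ratios(hull)
```

Taking the first few entries of `ranking`, rather than only the top one, mirrors the idea of examining the first three retained models to account for model selection uncertainty.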
Pages: 782-791
Number of pages: 10
Related papers
30 in total
[1]   NEW LOOK AT STATISTICAL-MODEL IDENTIFICATION [J].
AKAIKE, H .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 1974, AC19 (06) :716-723
[2]  
ANDERSON EDGAR, 1936, ANN MISSOURI BOT GARD, V23, P457, DOI 10.2307/2394164
[3]  
Anderson Edgar, 1935, Bulletin of the American Iris Society, V59, P2
[4]  
[Anonymous], 2011, MIXTURES ESTIMATION
[5]  
[Anonymous], 2006, Model selection and model averaging, DOI DOI 10.1017/CBO9780511790485.003
[6]   Mixtures of Factor Analyzers with Common Factor Loadings: Applications to the Clustering and Visualization of High-Dimensional Data [J].
Baek, Jangsun ;
McLachlan, Geoffrey J. ;
Flack, Lloyd K. .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2010, 32 (07) :1298-1309
[7]   SCREE TEST FOR NUMBER OF FACTORS [J].
CATTELL, RB .
MULTIVARIATE BEHAVIORAL RESEARCH, 1966, 1 (02) :245-276
[8]   An entropy criterion for assessing the number of clusters in a mixture model [J].
Celeux, G ;
Soromenho, G .
JOURNAL OF CLASSIFICATION, 1996, 13 (02) :195-212
[9]   Hierarchical classes models for three-way three-mode binary data: Interrelations and model selection [J].
Ceulemans, E ;
Van Mechelen, I .
PSYCHOMETRIKA, 2005, 70 (03) :461-480
[10]   Selecting among three-mode principal component models of different types and complexities: A numerical convex hull based method [J].
Ceulemans, E ;
Kiers, HAL .
BRITISH JOURNAL OF MATHEMATICAL & STATISTICAL PSYCHOLOGY, 2006, 59 :133-150