The impact of covariance misspecification in multivariate Gaussian mixtures on estimation and inference: an application to longitudinal modeling

Cited by: 15
Authors
Heggeseth, Brianna C. [1]
Jewell, Nicholas P. [1, 2]
Affiliations
[1] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA
[2] Univ Calif Berkeley, Div Biostat, Berkeley, CA 94720 USA
Funding
National Science Foundation (USA);
Keywords
covariance; model misspecification; mixture models; Kullback-Leibler divergence; MAXIMUM-LIKELIHOOD-ESTIMATION; IDENTIFIABILITY; TRAJECTORIES; CONSISTENCY; VARIABLES; BIAS;
DOI
10.1002/sim.5729
Chinese Library Classification
Q [Biological Sciences];
Discipline classification codes
07 ; 0710 ; 09 ;
Abstract
Multivariate Gaussian mixtures are a class of models that provide a flexible parametric approach for the representation of heterogeneous multivariate outcomes. When the outcome is a vector of repeated measurements taken on the same subject, there is often inherent dependence between observations. However, a common covariance assumption is conditional independence: that is, given the mixture component label, the outcomes for subjects are independent. In this paper, we study, through asymptotic bias calculations and simulation, the impact of covariance misspecification in multivariate Gaussian mixtures. Although maximum likelihood estimators of regression and mixing probability parameters are not consistent under misspecification, they have little asymptotic bias when mixture components are well separated or if the assumed correlation is close to the truth even when the covariance is misspecified. We also present a robust standard error estimator and show that it outperforms conventional estimators in simulations and can indicate that the model is misspecified. Body mass index data from a national longitudinal study are used to demonstrate the effects of misspecification on potential inferences made in practice. Copyright © 2013 John Wiley & Sons, Ltd.
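The setting described in the abstract can be written out as a short sketch. The notation here ($K$, $\pi_k$, $x_i$, $\beta_k$, $\Sigma_k$, $\sigma_k^2$, $g$, $f_\theta$, $\hat A$, $\hat B$) is assumed for illustration and is not taken from the paper. The limit and the sandwich form are the standard results for maximum likelihood under misspecification, and may differ in detail from the estimator the authors propose.

```latex
\begin{align*}
% Assumed working model: K-component Gaussian mixture for the
% length-m outcome vector y_i with design matrix x_i
f_\theta(y_i \mid x_i)
  &= \sum_{k=1}^{K} \pi_k \, \phi_m\!\left(y_i;\, x_i \beta_k,\, \Sigma_k\right),
  \qquad \sum_{k=1}^{K} \pi_k = 1 \\
% Conditional independence: the working covariance is diagonal
\Sigma_k &= \sigma_k^2 I_m \\
% If the true density g has dependent errors, the MLE converges to the
% Kullback-Leibler minimizer rather than to the true parameter
\hat\theta_n \;\longrightarrow\; \theta^{*}
  &= \arg\min_{\theta} \; \mathrm{E}_g\!\left[ \log \frac{g(y \mid x)}{f_\theta(y \mid x)} \right] \\
% Generic robust (sandwich) variance estimator
\widehat{\mathrm{Var}}(\hat\theta_n)
  &= \frac{1}{n}\, \hat A^{-1} \hat B \hat A^{-1}, \\
\hat A &= -\frac{1}{n} \sum_{i=1}^{n} \nabla_\theta^2 \log f_{\hat\theta}(y_i \mid x_i), \\
\hat B &= \frac{1}{n} \sum_{i=1}^{n}
  \nabla_\theta \log f_{\hat\theta}(y_i \mid x_i)\,
  \nabla_\theta \log f_{\hat\theta}(y_i \mid x_i)^{\top}
\end{align*}
```

When the working model is correct, $\hat A$ and $\hat B$ estimate the same matrix and the sandwich form reduces to the conventional inverse-information estimator, so a large gap between the two standard errors is one informal signal of misspecification.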
Pages: 2790-2803
Page count: 14