Efficient algorithms for robust estimation in linear mixed-effects models using the multivariate t distribution

Cited by: 229
Authors
Pinheiro, JC
Liu, CH
Wu, YN
Affiliations
[1] Bell Labs, Lucent Technol, Murray Hill, NJ 07974 USA
[2] Univ Calif Los Angeles, Dept Stat, Los Angeles, CA 90095 USA
Keywords
EM; ECM; ECME; longitudinal data; outliers; PX-EM; random effects; repeated measures
DOI
10.1198/10618600152628059
Chinese Library Classification
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract
Linear mixed-effects models are frequently used to analyze repeated measures data because they flexibly model the within-subject correlation often present in this type of data. The most popular linear mixed-effects model for a continuous response assumes normal distributions for the random effects and the within-subject errors, making it sensitive to outliers. Such outliers are more problematic for mixed-effects models than for fixed-effects models, because they may occur in the random effects, in the within-subject errors, or in both, making them harder to detect in practice. Motivated by a real dataset from an orthodontic study, we propose a robust hierarchical linear mixed-effects model in which the random effects and the within-subject errors have multivariate t-distributions, with known or unknown degrees of freedom, which are allowed to vary with groups of subjects. By using a gamma-normal hierarchical structure, our model allows the identification and classification of both types of outliers, comparing favorably to other multivariate t models for robust estimation in mixed-effects models previously described in the literature, which use only the marginal distribution of the responses. Allowing for unknown degrees of freedom, which are estimated from the data, our model provides a balance between robustness and efficiency, leading to reliable results for valid inference. We describe and compare efficient EM-type algorithms, including ECM, ECME, and PX-EM, for maximum likelihood estimation in the robust multivariate t model. We compare the performance of the Gaussian and the multivariate t models under different patterns of outliers. Simulation results indicate that the multivariate t model substantially outperforms the Gaussian model when outliers are present in the data, even in moderate amounts.
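The gamma-normal hierarchy mentioned in the abstract can be illustrated with a much simpler setting than the paper's. The sketch below is not the authors' mixed-effects algorithm: it assumes a univariate location-scale t model with known degrees of freedom `nu`, and the function names `sample_t_via_gamma_normal` and `fit_t_em` are hypothetical. It shows how a latent gamma weight per observation yields a t distribution, and how a plain EM iteration uses the expected weights to downweight outliers.

```python
import random


def sample_t_via_gamma_normal(n, mu, sigma, nu, rng):
    """Draw n values from a scaled t distribution via the gamma-normal hierarchy.

    tau ~ Gamma(shape=nu/2, rate=nu/2), then y | tau ~ Normal(mu, sigma^2 / tau).
    """
    draws = []
    for _ in range(n):
        # random.gammavariate takes (shape, scale); scale = 1 / rate = 2 / nu.
        tau = rng.gammavariate(nu / 2.0, 2.0 / nu)
        draws.append(rng.gauss(mu, sigma / tau ** 0.5))
    return draws


def fit_t_em(y, nu, n_iter=500, tol=1e-10):
    """EM for location mu and squared scale sigma2 of a univariate t, nu known.

    Assumes the data are not all identical (sigma2 must stay positive).
    Returns (mu, sigma2, weights); small weights flag likely outliers.
    """
    n = len(y)
    mu = sum(y) / n
    sigma2 = sum((v - mu) ** 2 for v in y) / n
    weights = [1.0] * n
    for _ in range(n_iter):
        # E-step: expected latent gamma weight for each observation.
        weights = [(nu + 1.0) / (nu + (v - mu) ** 2 / sigma2) for v in y]
        # M-step: weighted mean and weighted squared scale.
        new_mu = sum(w * v for w, v in zip(weights, y)) / sum(weights)
        new_sigma2 = sum(w * (v - new_mu) ** 2 for w, v in zip(weights, y)) / n
        converged = abs(new_mu - mu) < tol and abs(new_sigma2 - sigma2) < tol
        mu, sigma2 = new_mu, new_sigma2
        if converged:
            break
    return mu, sigma2, weights


# Illustrative data (made up for this sketch): six small values and one outlier.
data = [0.1, -0.2, 0.3, 0.0, -0.1, 0.2, 25.0]
sample_mean = sum(data) / len(data)
t_mu, t_sigma2, t_weights = fit_t_em(data, nu=4.0)

# Draw a few values from the hierarchy with a seeded generator.
draws = sample_t_via_gamma_normal(5, mu=0.0, sigma=1.0, nu=4.0, rng=random.Random(0))
```

In this toy example the outlier receives the smallest weight, so the fitted location stays near the bulk of the data while the sample mean is pulled toward the outlier. The paper's model applies the same idea separately to the random effects and the within-subject errors, which this sketch does not attempt.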
Pages: 249-276
Number of pages: 28