Robustness of linear mixed-effects models to violations of distributional assumptions

Cited by: 845
Authors
Schielzeth, Holger [1]
Dingemanse, Niels J. [2]
Nakagawa, Shinichi [3, 4]
Westneat, David F. [5]
Allegue, Hassen [6]
Teplitsky, Celine [7]
Reale, Denis [6]
Dochtermann, Ned A. [8]
Garamszegi, Laszlo Zsolt [9, 10]
Araya-Ajoy, Yimen G. [11]
Affiliations
[1] Friedrich Schiller Univ, Inst Ecol & Evolut, Jena, Germany
[2] Ludwig Maximilians Univ Munchen, Dept Biol, Behav Ecol, Planegg Martinsried, Germany
[3] Univ New South Wales, Evolut & Ecol Res Ctr, Sydney, NSW, Australia
[4] Univ New South Wales, Sch Biol Earth & Environm Sci, Sydney, NSW, Australia
[5] Univ Kentucky, Dept Biol, Lexington, KY USA
[6] Univ Quebec Montreal, Dept Sci Biol, Montreal, PQ, Canada
[7] CNRS, Ctr Ecol Fonct & Evolut, Montpellier, France
[8] North Dakota State Univ, Dept Biol Sci, Fargo, ND USA
[9] Inst Ecol & Bot, Ctr Ecol Res, Vacratot, Hungary
[10] Eotvos Lorand Univ, Dept Plant Systemat Ecol & Theoret Biol, Theoret Biol & Evolutionary Ecol Res Grp, MTA ELTE, Budapest, Hungary
[11] Norwegian Univ Sci & Technol NTNU, Dept Biol, Ctr Biodivers Dynam CBD, Trondheim, Norway
Source
METHODS IN ECOLOGY AND EVOLUTION | 2020 / Volume 11 / Issue 09
Funding
National Science Foundation (USA);
Keywords
biostatistics; correlated predictors; distributional assumptions; linear mixed-effects models; missing random effects; statistical quantification of individual differences (SQuID); PRACTICAL GUIDE; INFERENCE;
DOI
10.1111/2041-210X.13434
Chinese Library Classification number
Q14 [Ecology (bioecology)];
Discipline classification code
071012 ; 0713 ;
Abstract
Linear mixed-effects models are powerful tools for analysing complex datasets with repeated or clustered observations, a common data structure in ecology and evolution. Mixed-effects models involve complex fitting procedures and make several assumptions, in particular about the distribution of residual and random effects. Violations of these assumptions are common in real datasets, yet it is not always clear how much these violations matter to accurate and unbiased estimation. Here we address the consequences of violations in distributional assumptions and the impact of missing random effect components on model estimates. In particular, we evaluate the effects of skewed, bimodal and heteroscedastic random effect and residual variances, of missing random effect terms and of correlated fixed effect predictors. We focus on bias and prediction error on estimates of fixed and random effects. Model estimates were usually robust to violations of assumptions, with the exception of slight upward biases in estimates of random effect variance if the generating distribution was bimodal but was modelled by Gaussian error distributions. Further, estimates for (random effect) components that violated distributional assumptions became less precise but remained unbiased. However, this particular problem did not affect other parameters of the model. The same pattern was found for strongly correlated fixed effects, which led to imprecise, but unbiased estimates, with uncertainty estimates reflecting imprecision. Unmodelled sources of random effect variance had predictable effects on variance component estimates. The pattern is best viewed as a cascade of hierarchical grouping factors. Variances trickle down the hierarchy such that missing higher-level random effect variances pool at lower levels and missing lower-level and crossed random effect variances manifest as residual variance. 
Overall, our results show remarkable robustness of mixed-effects models that should allow researchers to use mixed-effects models even if the distributional assumptions are objectively violated. However, this does not free researchers from careful evaluation of the model. Estimates that are based on data that show clear violations of key assumptions should be treated with caution because individual datasets might give highly imprecise estimates, even if they will be unbiased on average across datasets.
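The "trickle down" pattern described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' simulation code: the design sizes, the variance values and the function names are assumptions chosen for illustration, and it uses a simple ANOVA (method-of-moments) estimator for a balanced one-way layout rather than a likelihood-based mixed-model fit. Data are generated with a higher-level grouping factor (groups), individuals nested within groups, and repeated observations per individual. The analysis then omits the group level and estimates only among-individual and residual variance.

```python
import random
from statistics import fmean


def simulate(n_groups=300, inds_per_group=4, obs_per_ind=5,
             var_group=1.0, var_ind=0.5, var_resid=1.0, seed=1):
    """Simulate observations for individuals nested within groups.

    All sizes and variances are illustrative assumptions.
    Returns one list of observations per individual.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n_groups):
        group_effect = rng.gauss(0.0, var_group ** 0.5)
        for _ in range(inds_per_group):
            ind_effect = rng.gauss(0.0, var_ind ** 0.5)
            data.append([
                group_effect + ind_effect + rng.gauss(0.0, var_resid ** 0.5)
                for _ in range(obs_per_ind)
            ])
    return data


def one_way_variance_components(data):
    """ANOVA estimates of among-unit and residual variance (balanced data)."""
    n = len(data[0])          # observations per individual
    k = len(data)             # number of individuals
    means = [fmean(obs) for obs in data]
    grand = fmean(means)
    ms_between = n * sum((m - grand) ** 2 for m in means) / (k - 1)
    ms_within = sum(
        (y - m) ** 2 for obs, m in zip(data, means) for y in obs
    ) / (k * (n - 1))
    return (ms_between - ms_within) / n, ms_within


# The group level is deliberately left out of the analysis.
data = simulate()
among_ind, resid = one_way_variance_components(data)
print(f"among-individual variance estimate: {among_ind:.2f}")
print(f"residual variance estimate:         {resid:.2f}")
```

Under these assumed settings, the among-individual estimate should land near the sum of the simulated group and individual variances (1.0 + 0.5) rather than near 0.5, while the residual estimate stays close to its simulated value of 1.0. This matches the abstract's statement that missing higher-level variance pools at the next lower level.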
Pages: 1141-1152
Number of pages: 12