For testing the significance of regression coefficients, go ahead and log-transform count data

Cited by: 176
Author
Ives, Anthony R. [1]
Affiliation
[1] Univ Wisconsin, Dept Zool, Madison, WI 53706 USA
Source
METHODS IN ECOLOGY AND EVOLUTION | 2015 / Vol. 6 / Issue 07
Funding
US National Science Foundation;
Keywords
generalized linear mixed models; generalized linear models; least-squares regression; linear models; transformation; type I errors; LINEAR MIXED MODELS; ECOLOGY;
DOI
10.1111/2041-210X.12386
Chinese Library Classification number
Q14 [Ecology (bioecology)];
Discipline classification code
071012 ; 0713 ;
Abstract
The rise in the use of statistical models for non-Gaussian data, such as generalized linear models (GLMs) and generalized linear mixed models (GLMMs), is pushing aside the traditional approach of transforming data and applying least-squares linear models (LMs). Nonetheless, many least-squares statistical tests depend on the variance of the sum of residuals, which by the Central Limit Theorem converge to a Gaussian distribution for large sample sizes. Therefore, least-squares LMs will likely have good performance in assessing the statistical significance of regression coefficients. Using simulations of count data, I compared GLM approaches for testing whether regression coefficients differ from zero with the traditional approach of applying LMs to transformed data. Simulations assumed that variation among sample populations was either (i) negative binomial or (ii) log-normal Poisson (i.e. log-normal variation among populations that were then sampled by a Poisson distribution). I used the simulated data to conduct tests of the hypotheses that regression coefficients differed from zero; I did not investigate statistical properties of the coefficient estimators, such as bias and precision. For negative binomial simulations whose assumptions closely matched the GLMs, the GLMs were nonetheless prone to type I errors (false positives) especially when there was more than one predictor (independent) variable. After correcting for type I errors, however, the GLMs provided slightly better statistical power than LMs. For log-normal-Poisson simulations, both a GLMM and the LMs performed well, but under some simulated conditions the GLMs had high type I error rates, a deadly sin for statistical tests. These results show that, while GLMs have slight advantages in power when they are properly specified, they can lead to badly wrong conclusions about the significance of regression coefficients if they are mis-specified.
In contrast, transforming data and applying least-squares linear analyses provide robust statistical tests for significance over a wide range of conditions. Thus, the traditional approach of transforming data and applying LMs is still useful.
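The kind of check described in the abstract can be illustrated with a minimal sketch. This is not the author's simulation code. It assumes a single predictor with a true slope of zero, negative binomial counts, a log(y + 1) transformation, and a sample size of 100. The function names, parameter values, and hard-coded critical value are all hypothetical choices made for illustration. It estimates how often a least-squares t-test on the transformed counts rejects the null hypothesis at the 5% level.

```python
import numpy as np

N = 100            # sample size per simulated data set (assumed)
T_CRIT = 1.984     # approximate two-sided 5% critical value of Student's t, N - 2 = 98 df


def slope_t_stat(x, z):
    """t statistic for the slope in a least-squares regression of z on x."""
    n = len(x)
    xc = x - x.mean()
    zc = z - z.mean()
    sxx = np.sum(xc ** 2)
    slope = np.sum(xc * zc) / sxx
    resid = zc - slope * xc
    s2 = np.sum(resid ** 2) / (n - 2)      # residual variance estimate
    return slope / np.sqrt(s2 / sxx)       # slope divided by its standard error


def type_one_error_rate(n_sims=2000, mean_count=5.0, dispersion=1.0, seed=0):
    """Fraction of null simulations in which the LM slope test rejects at 5%."""
    rng = np.random.default_rng(seed)
    # Negative binomial with mean mu and size k corresponds to p = k / (k + mu).
    p = dispersion / (dispersion + mean_count)
    rejections = 0
    for _ in range(n_sims):
        x = rng.normal(size=N)                          # predictor, unrelated to y
        y = rng.negative_binomial(dispersion, p, size=N)
        z = np.log(y + 1.0)                             # transformed counts
        if abs(slope_t_stat(x, z)) > T_CRIT:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    print(f"Estimated type I error rate: {type_one_error_rate():.3f}")
```

A rate close to 0.05 would be consistent with the abstract's claim that LMs on transformed data give well-calibrated significance tests. Comparing against a GLM, as the paper does, would require fitting a negative binomial or Poisson model to the same simulated data, which this sketch omits.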
Pages: 828-835
Page count: 8