Dealing with collinearity in behavioural and ecological data: model averaging and the problems of measurement error

Author
Robert P. Freckleton
Affiliation
[1] Department of Animal and Plant Sciences, University of Sheffield
Source
Behavioral Ecology and Sociobiology | 2011 / Volume 65
Keywords
Regression; Model selection; Information theory
Abstract
There has been a great deal of recent discussion of the practice of regression analysis (or more generally, linear modelling) in behaviour and ecology. In this paper, I highlight two under-considered factors, collinearity and measurement error in predictors, and consider what happens when both are present at the same time. I examine the consequences for conventional regression analysis (ordinary least squares, OLS) as well as for model averaging methods, typified by information theoretic approaches based around Akaike's information criterion. As is well known, collinearity inflates the variance of estimated slopes in OLS analysis. In the presence of collinearity, model averaging reduces this variance for predictors with weak effects, but can also lead to parameter bias. When collinearity is strong or when all predictors have strong effects, model averaging relies heavily on the full model including all predictors, and hence its results are essentially the same as those from OLS. I highlight that it is not safe to simply eliminate collinear variables without due consideration of their likely independent effects, as this can lead to biases. I also consider measurement error and show that it can lead to extreme biases when predictors are collinear, have strong effects, but differ in their degree of measurement error. I highlight techniques for diagnosing and dealing with these problems. These results reinforce that automated model selection techniques should not be relied on in the analysis of complex multivariable datasets.
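The variance inflation mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's own simulation: the two-predictor setup, the function names `vif_two_predictors` and `simulate_slope_sd`, and all parameter values are assumptions chosen for illustration. For two predictors with correlation r, the standard variance inflation factor is 1 / (1 - r^2), and a small simulation shows the spread of the OLS slope growing as r increases.

```python
import numpy as np


def vif_two_predictors(r):
    """Variance inflation factor for either of two predictors with correlation r."""
    return 1.0 / (1.0 - r ** 2)


def simulate_slope_sd(r, n=100, reps=2000, seed=0):
    """Empirical SD of the OLS slope on x1 when corr(x1, x2) = r.

    Hypothetical setup (an assumption, not taken from the paper):
    y = x1 + x2 + noise, with unit-variance predictors and noise.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, r], [r, 1.0]])
    slopes = np.empty(reps)
    for i in range(reps):
        # Draw correlated predictors and a response with both true slopes = 1.
        X = rng.multivariate_normal([0.0, 0.0], cov, size=n)
        y = X[:, 0] + X[:, 1] + rng.normal(size=n)
        # Fit OLS with an intercept; beta[1] is the slope on x1.
        design = np.column_stack([np.ones(n), X])
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        slopes[i] = beta[1]
    return slopes.std(ddof=1)


if __name__ == "__main__":
    for r in (0.0, 0.5, 0.9):
        print(r, vif_two_predictors(r), simulate_slope_sd(r))
```

Under these assumptions, the simulated standard deviation of the slope should grow roughly in proportion to the square root of the variance inflation factor as r increases.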
Pages: 91-101
Number of pages: 10