The relative performance of AIC, AICC and BIC in the presence of unobserved heterogeneity

Cited by: 246

Authors:

Brewer, Mark J. ^{[1]}

Butler, Adam ^{[2]}

Cooksley, Susan L. ^{[3]}

Affiliations:

[1] Biomath & Stat Scotland, Aberdeen AB15 8QH, Scotland

[2] JCMB, Biomath & Stat Scotland, Kings Bldg, Edinburgh EH9 3JZ, Midlothian, Scotland

[3] James Hutton Inst, Aberdeen AB15 8QH, Scotland

Source:

METHODS IN ECOLOGY AND EVOLUTION | 2016 / Volume 7 / Issue 06

Keywords:

Akaike Information Criterion; Bayesian Information Criterion; generalized linear models; likelihood penalization; linear regression; model selection; statistical controversies

DOI:

10.1111/2041-210X.12541

Chinese Library Classification number:

Q14 [Ecology (bioecology)];

Discipline classification code:

071012 ; 0713 ;

Abstract:

Model selection is difficult. Even in the apparently straightforward case of choosing between standard linear regression models, there does not yet appear to be consensus in the statistical ecology literature as to the right approach. We review recent works on model selection in ecology and subsequently focus on one aspect in particular: the use of the Akaike Information Criterion (AIC) or its small-sample equivalent, AICC. We create a novel framework for simulation studies and use this to study model selection from simulated data sets with a range of properties, which differ in terms of degree of unobserved heterogeneity. We use the results of the simulation study to suggest an approach for model selection based on ideas from information criteria but requiring simulation. We find that the relative predictive performance of model selection by different information criteria is heavily dependent on the degree of unobserved heterogeneity between data sets. When heterogeneity is small, AIC or AICC are likely to perform well, but if heterogeneity is large, the Bayesian Information Criterion (BIC) will often perform better, due to the stronger penalty afforded. Our conclusion is that the choice of information criterion (or more broadly, the strength of likelihood penalty) should ideally be based upon hypothesized (or estimated from previous data) properties of the population of data sets from which a given data set could have arisen. Relying on a single form of information criterion is unlikely to be universally successful.
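The "stronger penalty" the abstract attributes to BIC can be made concrete with a minimal sketch. It assumes the standard textbook definitions of the three criteria (AICC is often written AICc), which this record does not itself state. The function names, the log-likelihoods and the parameter counts are hypothetical values chosen only for illustration; this is not the authors' simulation framework.

```python
import math

def aic(log_lik, k):
    """Akaike Information Criterion: penalty of 2 per parameter."""
    return 2 * k - 2 * log_lik

def aicc(log_lik, k, n):
    """Small-sample corrected AIC; needs n > k + 1."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    return aic(log_lik, k) + (2 * k * (k + 1)) / (n - k - 1)

def bic(log_lik, k, n):
    """Bayesian Information Criterion: penalty of ln(n) per parameter."""
    return k * math.log(n) - 2 * log_lik

# Hypothetical candidates: (maximized log-likelihood, number of parameters).
# These numbers are invented for illustration, not taken from the paper.
n = 50
candidates = {"small": (-120.0, 3), "large": (-115.5, 6)}

scores = {
    name: {"AIC": aic(ll, k), "AICc": aicc(ll, k, n), "BIC": bic(ll, k, n)}
    for name, (ll, k) in candidates.items()
}

# For each criterion, the preferred model is the one with the lowest score.
best = {
    crit: min(scores, key=lambda name: scores[name][crit])
    for crit in ("AIC", "AICc", "BIC")
}

for crit, name in best.items():
    print(f"{crit} prefers the {name} model")
```

With these illustrative inputs, AIC and AICc favour the larger model while BIC favours the smaller one, because ln(50) is roughly 3.9 per parameter against AIC's 2. Which choice predicts better is exactly what the abstract says depends on the degree of unobserved heterogeneity.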


Pages: 679-692

Page count: 14

