Selecting Explanatory Variables with the Modified Version of the Bayesian Information Criterion

Cited by: 25
Authors
Bogdan, Malgorzata [1]
Ghosh, Jayanta K. [2, 3]
Zak-Szatkowska, Malgorzata [1]
Affiliations
[1] Wroclaw Univ Technol, Inst Math & Comp Sci, PL-20370 Wroclaw, Poland
[2] Purdue Univ, Dept Stat, W Lafayette, IN 47907 USA
[3] Indian Stat Inst, Kolkata, India
Keywords
data mining; multiple regression; model selection; multiple testing; Bayes oracle
DOI
10.1002/qre.936
Chinese Library Classification (CLC) number
T [Industrial Technology]
Subject classification code
08
Abstract
We consider the situation in which a large database needs to be analyzed to identify a few important predictors of a given quantitative response variable. There is considerable evidence that in this case classical model selection criteria, such as the Akaike information criterion (AIC) or the Bayesian information criterion (BIC), have a strong tendency to overestimate the number of regressors. In our earlier papers, we developed a modified version of BIC (mBIC), which enables the incorporation of prior knowledge on the number of regressors and prevents overestimation. In this article, we review earlier results on mBIC and discuss the relationship of this criterion to the well-known Bonferroni correction for multiple testing and to the Bayes oracle, which minimizes the expected costs of inference. We use computer simulations and a real data analysis to illustrate the performance of the original mBIC and of its rank version, which is designed to deal with data that contain some outlying observations. Copyright (C) 2008 John Wiley & Sons, Ltd.
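The abstract does not state the criterion itself. As a rough illustration of the idea it describes, the sketch below assumes the form commonly quoted for mBIC in linear regression, n log RSS + k log n + 2k log(p/c), where n is the sample size, p the number of candidate regressors, k the number of regressors in the model, and c the prior expected number of true regressors. This formula, the default c = 4, the function names, and the greedy forward search are all assumptions made for illustration; they are not taken from this record, and published variants of the penalty differ slightly.

```python
import numpy as np

def rss(X, y, cols):
    """Residual sum of squares of an OLS fit (with intercept) on the given columns."""
    n = len(y)
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return float(resid @ resid)

def bic(X, y, cols):
    """Classical BIC for a Gaussian linear model, up to an additive constant."""
    n = len(y)
    return n * np.log(rss(X, y, cols)) + len(cols) * np.log(n)

def mbic(X, y, cols, c=4.0):
    """Assumed mBIC form: BIC plus an extra penalty 2k log(p/c).

    c is the prior expected number of true regressors (hypothetical default).
    """
    p = X.shape[1]
    return bic(X, y, cols) + 2 * len(cols) * np.log(p / c)

def forward_select(X, y, criterion):
    """Greedy forward search: add the best regressor while the criterion decreases."""
    selected = []
    best = criterion(X, y, selected)
    while len(selected) < X.shape[1]:
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        score, j = min((criterion(X, y, selected + [j]), j) for j in candidates)
        if score >= best:
            break
        selected.append(j)
        best = score
    return selected

# Simulated sparse setting: 50 candidate regressors, only column 0 matters.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 50))
y = 3.0 * X[:, 0] + rng.standard_normal(100)

sel_bic = forward_select(X, y, bic)
sel_mbic = forward_select(X, y, mbic)
print("BIC selects: ", sel_bic)
print("mBIC selects:", sel_mbic)
```

Because the extra term depends only on the model size, both criteria rank same-size models identically, and in this greedy search mBIC can only stop at the same step as BIC or earlier. That mirrors, in a simplified way, the protection against overestimating the number of regressors described in the abstract.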
Pages: 627-641
Page count: 15