Selecting Explanatory Variables with the Modified Version of the Bayesian Information Criterion

Cited by: 25
Authors
Bogdan, Malgorzata [1]
Ghosh, Jayanta K. [2, 3]
Zak-Szatkowska, Malgorzata [1]
Affiliations
[1] Wroclaw Univ Technol, Inst Math & Comp Sci, PL-20370 Wroclaw, Poland
[2] Purdue Univ, Dept Stat, W Lafayette, IN 47907 USA
[3] Indian Stat Inst, Kolkata, India
Keywords
data mining; multiple regression; model selection; multiple testing; Bayes oracle
DOI
10.1002/qre.936
Chinese Library Classification (CLC) number
T [Industrial Technology]
Subject classification code
08
Abstract
We consider the situation in which a large database needs to be analyzed to identify a few important predictors of a given quantitative response variable. There is a lot of evidence that in this case classical model selection criteria, such as the Akaike information criterion or the Bayesian information criterion (BIC), have a strong tendency to overestimate the number of regressors. In our earlier papers, we developed the modified version of BIC (mBIC), which enables the incorporation of prior knowledge on a number of regressors and prevents overestimation. In this article, we review earlier results on mBIC and discuss the relationship of this criterion to the well-known Bonferroni correction for multiple testing and the Bayes oracle, which minimizes the expected costs of inference. We use computer simulations and a real data analysis to illustrate the performance of the original mBIC and its rank version, which is designed to deal with data that contain some outlying observations. Copyright (C) 2008 John Wiley & Sons, Ltd.
Pages: 627-641
Number of pages: 15
Related papers
50 in total
  • [1] Selecting velocity models using Bayesian Information Criterion
    Danek, Tomasz
    Gierlach, Bartosz
    Kaderali, Ayiaz
    Slawinski, Michael A.
    Stanoev, Theodore
    GEOPHYSICAL PROSPECTING, 2023, 71 (05) : 811 - 815
  • [2] Modified version of Bayesian Information criterion for localization of multiple interacting quantitative trait loci
    Bogdan, M
    Ghosh, JK
    Doerge, RW
    Biecek, P
    Baierl, A
    Futschik, A
    Frommlet, F
    ANNALS OF HUMAN GENETICS, 2005, 69 : 765 - 765
  • [3] Performance of Akaike Information Criterion and Bayesian Information Criterion in Selecting Partition Models and Mixture Models
    Liu, Qin
    Charleston, Michael A.
    Richards, Shane A.
    Holland, Barbara R.
    SYSTEMATIC BIOLOGY, 2023, 72 (01) : 92 - 105
  • [4] Akaike Information Criterion for Selecting Variables in the Nested Error Regression Model
    Kubokawa, Tatsuya
    Srivastava, Muni S.
    COMMUNICATIONS IN STATISTICS-THEORY AND METHODS, 2012, 41 (15) : 2626 - 2642
  • [5] Selecting explanatory variables for passenger demand model
    Petrous, Matej
    2020 SMART CITY SYMPOSIUM PRAGUE (SCSP), 2020,
  • [6] A CRITERION FOR SELECTING VARIABLES IN A REGRESSION-ANALYSIS
    LINHART, H
    PSYCHOMETRIKA, 1960, 25 (01) : 45 - 58
  • [7] CRITERION FOR SELECTING VARIABLES FOR LINEAR DISCRIMINANT FUNCTION
    MCLACHLAN, GJ
    BIOMETRICS, 1976, 32 (03) : 529 - 534
  • [8] Forest construction of Gaussian and discrete variables with the application of Watanabe Bayesian Information Criterion
    Islam A.
    Suzuki J.
    Behaviormetrika, 2024, 51 (2) : 589 - 616
  • [9] Modified versions of the Bayesian Information Criterion for sparse Generalized Linear Models
    Zak-Szatkowska, Malgorzata
    Bogdan, Malgorzata
    COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2011, 55 (11) : 2908 - 2924
  • [10] Generating Small, Accurate Acoustic Models with a Modified Bayesian Information Criterion
    Yu, Kai
    Rutenbar, Rob A.
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 1165 - 1168