EXTENDED BIC FOR SMALL-n-LARGE-P SPARSE GLM

Cited by: 144
Authors
Chen, Jiahua [1]
Chen, Zehua [2]
Affiliations
[1] Univ British Columbia, Dept Stat, Vancouver, BC V6T 1Z4, Canada
[2] Natl Univ Singapore, Dept Stat & Appl Probabil, Singapore 117543, Singapore
Keywords
Consistency; exponential family; extended Bayes information criterion; feature selection; generalized linear model; small-n-large-P; VARIABLE SELECTION; MODEL SELECTION; MULTIPLE; REGULARIZATION; CRITERION; LOCI;
DOI
10.5705/ss.2010.216
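Chinese Library Classification (CLC) number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Subject classification codes
020208; 070103; 0714;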
Abstract
The small-n-large-P situation has become common in genetics research, medical studies, risk management, and other fields. Feature selection is crucial in these studies yet poses a serious challenge. Traditional criteria such as AIC, BIC, and cross-validation choose too many features. In this paper, we examine the variable selection problem under generalized linear models. We study an approach, the extended Bayes information criterion (EBIC), in which a prior takes specific account of the small-n-large-P situation. The criterion is shown to be variable selection consistent under generalized linear models. We also report simulation results and a data analysis to illustrate the effectiveness of EBIC for feature selection.
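The abstract does not state the criterion's formula. As a rough illustration only, the sketch below assumes a commonly cited form of EBIC: the ordinary BIC plus an extra penalty of 2γ log C(P, k), where k is the number of selected features, P is the total number of candidate features, and γ ≥ 0 is a tuning constant. The function name `ebic`, its parameters, and the default γ = 1 are hypothetical choices made for this example, and the exact penalty used in the paper may differ.

```python
import math


def ebic(log_lik, n, p, k, gamma=1.0):
    """Assumed EBIC form: -2*log_lik + k*log(n) + 2*gamma*log(C(p, k)).

    log_lik : maximized log-likelihood of the fitted GLM with k features
    n       : sample size
    p       : total number of candidate features
    k       : number of features in the candidate model
    gamma   : extra-penalty weight (gamma = 0 reduces to ordinary BIC)
    """
    if not 0 <= k <= p:
        raise ValueError("k must satisfy 0 <= k <= p")
    # log of the binomial coefficient C(p, k), computed via log-gamma
    # so that it stays finite when p is very large.
    log_binom = (
        math.lgamma(p + 1) - math.lgamma(k + 1) - math.lgamma(p - k + 1)
    )
    return -2.0 * log_lik + k * math.log(n) + 2.0 * gamma * log_binom


# Hypothetical numbers: two candidate models fitted to n = 100 samples
# drawn from p = 1000 candidate features.
small_model = ebic(log_lik=-60.0, n=100, p=1000, k=2)
large_model = ebic(log_lik=-55.0, n=100, p=1000, k=6)
preferred = "small" if small_model < large_model else "large"
```

Under this assumed form, the model with the smaller EBIC value is preferred. Because the extra term grows with P, a larger model has to improve the likelihood by more than ordinary BIC would require.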
Pages: 555-574
Number of pages: 20