Extended BIC for linear regression models with diverging number of relevant features and high or ultra-high feature spaces

Cited by: 26
Authors
Luo, Shan [1]
Chen, Zehua [1]
Affiliations
[1] Natl Univ Singapore, Dept Stat & Appl Probabil, Singapore 117543, Singapore
Keywords
Diverging number of parameters; Feature selection; Extended Bayes information criterion; High dimensional feature space; Penalized likelihood; Selection consistency; NONCONCAVE PENALIZED LIKELIHOOD; VARIABLE SELECTION; SHRINKAGE; LASSO;
DOI
10.1016/j.jspi.2012.08.015
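Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code
020208; 070103; 0714;
Abstract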
In many conventional scientific investigations with high or ultra-high dimensional feature spaces, the relevant features, though sparse, are large in number compared with classical statistical problems, and the magnitude of their effects tapers off. It is reasonable to model the number of relevant features as a diverging sequence as the sample size increases. In this paper, we investigate the properties of the extended Bayes information criterion (EBIC) (Chen and Chen, 2008) for feature selection in linear regression models with a diverging number of relevant features in high or ultra-high dimensional feature spaces. The selection consistency of the EBIC in this situation is established. The application of the EBIC to feature selection is considered in a SCAD cum EBIC procedure. Simulation studies are conducted to demonstrate the performance of the SCAD cum EBIC procedure in finite sample cases. © 2012 Elsevier B.V. All rights reserved.
Pages: 494-504
Number of pages: 11
Related papers
20 in total
[1]  
Akaike H., 1973, 2 INT S INFORM THEOR, P267
[2]  
[Anonymous], 2000, AMS C MATH CHALLENGE
[3]   Modifying the Schwarz Bayesian information criterion to locate multiple interacting quantitative trait loci [J].
Bogdan, M ;
Ghosh, JK ;
Doerge, RW .
GENETICS, 2004, 167 (02) :989-999
[4]   A model selection approach for the identification of quantitative trait loci in experimental crosses [J].
Broman, KW ;
Speed, TP .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2002, 64 :641-656
[5]   Extended Bayesian information criteria for model selection with large model spaces [J].
Chen, Jiahua ;
Chen, Zehua .
BIOMETRIKA, 2008, 95 (03) :759-771
[6]   Tournament screening cum EBIC for feature selection with high-dimensional feature spaces [J].
Chen Zehua ;
Chen JiaHua .
SCIENCE IN CHINA SERIES A-MATHEMATICS, 2009, 52 (06) :1327-1341
[7]   Sure independence screening for ultrahigh dimensional feature space [J].
Fan, Jianqing ;
Lv, Jinchi .
JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2008, 70 :849-883
[8]   Nonconcave penalized likelihood with a diverging number of parameters [J].
Fan, JQ ;
Peng, H .
ANNALS OF STATISTICS, 2004, 32 (03) :928-961
[9]   Variable selection via nonconcave penalized likelihood and its oracle properties [J].
Fan, JQ ;
Li, RZ .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2001, 96 (456) :1348-1360
[10]   Smoothly Clipped Absolute Deviation on High Dimensions [J].
Kim, Yongdai ;
Choi, Hosik ;
Oh, Hee-Seok .
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2008, 103 (484) :1665-1673