Model Based Screening Embedded Bayesian Variable Selection for Ultra-high Dimensional Settings

Cited by: 2

Authors:

Li, Dongjin [1]

Dutta, Somak [1]

Roy, Vivekananda [1]

Affiliation:

[1] Iowa State Univ, Dept Stat, Ames, IA USA

Source:

JOURNAL OF COMPUTATIONAL AND GRAPHICAL STATISTICS | 2023, Vol. 32, No. 1

Keywords:

GWAS; Hierarchical model; Posterior prediction; Shrinkage; Spike and slab; Stochastic search; Subset selection; STANDARD ERRORS; REGRESSION SHRINKAGE; OPTIMIZATION; CRITERIA; PRIORS;

DOI:

10.1080/10618600.2022.2074428

Chinese Library Classification (CLC) number:

O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];

Subject classification code:

020208 ; 070103 ; 0714 ;

Abstract:

We develop a Bayesian variable selection method, called SVEN, based on a hierarchical Gaussian linear model with priors placed on the regression coefficients as well as on the model space. Sparsity is achieved by using degenerate spike priors on inactive variables, whereas Gaussian slab priors are placed on the coefficients for the important predictors making the posterior probability of a model available in explicit form (up to a normalizing constant). Embedding a unique model based screening and using fast Cholesky updates, SVEN produces a highly scalable computational framework to explore gigantic model spaces, rapidly identify the regions of high posterior probabilities and make fast inference and prediction. A temperature schedule is used to further mitigate multimodal posterior distributions. The temperature value is guided by our model selection consistency results which hold even when the norm of mean effects solely due to the unimportant variables diverges. An appealing byproduct of SVEN is the construction of novel model weight adjusted prediction intervals. The performance of SVEN is demonstrated through a number of simulation experiments and a real data example from a genome wide association study with over half a million markers. Supplementary materials for this article are available online.
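The abstract notes that Gaussian slab priors make the posterior probability of a model available in closed form up to a normalizing constant, and that a temperature is used to handle multimodality. The following is a minimal sketch of that idea, not the authors' implementation. The abstract does not give the prior details, so everything specific here is an assumption: a slab of the form N(0, sigma^2 / lam) on active coefficients, a flat prior on the intercept, a 1/sigma^2 prior on the error variance, and independent Bernoulli(w) inclusion indicators. The names `log_model_score`, `tempered_weights`, `lam`, `w`, and `temperature` are hypothetical.

```python
import numpy as np


def log_model_score(X, y, active, lam=1.0, w=0.1):
    """Unnormalized log posterior of the model whose active columns are `active`.

    Assumed setup (not taken from the abstract): slab N(0, sigma^2 / lam),
    flat intercept prior, 1/sigma^2 variance prior, Bernoulli(w) inclusion.
    """
    n, p = X.shape
    active = np.asarray(active, dtype=int)
    k = active.size
    y_c = y - y.mean()  # centering integrates out the intercept
    log_prior = k * np.log(w) + (p - k) * np.log1p(-w)
    if k == 0:
        return -0.5 * (n - 1) * np.log(y_c @ y_c) + log_prior
    Xg = X[:, active] - X[:, active].mean(axis=0)
    A = Xg.T @ Xg + lam * np.eye(k)  # ridge-type Gram matrix
    L = np.linalg.cholesky(A)  # A = L L^T
    v = np.linalg.solve(L, Xg.T @ y_c)  # v'v = y'X A^{-1} X'y
    rss = y_c @ y_c - v @ v  # ridge residual sum of squares
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    return (0.5 * k * np.log(lam) - 0.5 * log_det
            - 0.5 * (n - 1) * np.log(rss) + log_prior)


def tempered_weights(scores, temperature=1.0):
    """Normalize exp(score / temperature) over a finite set of candidate models."""
    z = np.asarray(scores, dtype=float) / temperature
    z = z - z.max()  # subtract the maximum for numerical stability
    weights = np.exp(z)
    return weights / weights.sum()


# Small simulated illustration (synthetic data, for demonstration only).
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((100, 20))
y_demo = X_demo[:, :3] @ np.full(3, 2.0) + rng.standard_normal(100)
candidates = [[], [0, 1, 2], [3, 4, 5]]
demo_scores = [log_model_score(X_demo, y_demo, m) for m in candidates]
print(tempered_weights(demo_scores, temperature=2.0))
```

A temperature above 1 flattens the weights across candidate models, which is one common way to let a stochastic search move between separated modes. The Cholesky factor is recomputed from scratch here for clarity; a scalable search would instead update it as variables enter or leave the model.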


Pages: 61-73

Page count: 13

50 items in total

[1] Bayesian Multiresolution Variable Selection for Ultra-High Dimensional Neuroimaging Data
Zhao, Yize
Kang, Jian
Long, Qi
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, 2018, 15 (02) : 537 - 550
[2] Grouped variable screening for ultra-high dimensional data for linear model
Qiu, Debin
Ahn, Jeongyoun
COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2020, 144
[3] Forward Regression for Ultra-High Dimensional Variable Screening
Wang, Hansheng
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2009, 104 (488) : 1512 - 1524
[4] Semiparametric Bayesian information criterion for model selection in ultra-high dimensional additive models
Lian, Heng
JOURNAL OF MULTIVARIATE ANALYSIS, 2014, 123 : 304 - 310
[5] Bayesian Model Selection in High-Dimensional Settings
Johnson, Valen E.
Rossell, David
JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION, 2012, 107 (498) : 649 - 660
[6] A robust variable screening procedure for ultra-high dimensional data
Ghosh, Abhik
Thoresen, Magne
STATISTICAL METHODS IN MEDICAL RESEARCH, 2021, 30 (08) : 1816 - 1832
[7] Ultra-high dimensional variable selection for doubly robust causal inference
Tang, Dingke
Kong, Dehan
Pan, Wenliang
Wang, Linbo
BIOMETRICS, 2023, 79 (02) : 903 - 914
[8] Forward variable selection for ultra-high dimensional quantile regression models
Honda, Toshio
Lin, Chien-Tong
ANNALS OF THE INSTITUTE OF STATISTICAL MATHEMATICS, 2023, 75 : 393 - 424
[9] Forward variable selection for ultra-high dimensional quantile regression models
Honda, Toshio
Lin, Chien-Tong
ANNALS OF THE INSTITUTE OF STATISTICAL MATHEMATICS, 2023, 75 (03) : 393 - 424
[10] Ultra-high dimensional variable screening via Gram–Schmidt orthogonalization
Wang, Huiwen
Liu, Ruiping
Wang, Shanshan
Wang, Zhichao
Saporta, Gilbert
COMPUTATIONAL STATISTICS, 2020, 35 : 1153 - 1170
