Bayesian feature selection in high-dimensional regression in presence of correlated noise

Cited by: 1
Authors
Feldman, Guy [1]
Bhadra, Anindya [1]
Kirshner, Sergey [1]
Affiliations
[1] Purdue Univ, Dept Stat, 250 N Univ St, W Lafayette, IN 47907 USA
Source
STAT | 2014 / Vol. 3 / Issue 01
Keywords
Bayesian methods; genomics; graphical models; high-dimensional data; variable selection;
DOI
10.1002/sta4.60
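Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification codes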
020208 ; 070103 ; 0714 ;
Abstract
We consider the problem of feature selection in a high-dimensional multiple predictors, multiple responses regression setting. Assuming that regression errors are i.i.d. when they are in fact dependent leads to inconsistent and inefficient feature estimates. We relax the i.i.d. assumption by allowing the errors to exhibit a tree-structured dependence. This allows a Bayesian problem formulation with the error dependence structure treated as an auxiliary variable that can be integrated out analytically with the help of the matrix-tree theorem. Mixing over trees results in a flexible technique for modelling the graphical structure for the regression errors. Furthermore, the analytic integration results in a collapsed Gibbs sampler for feature selection that is computationally efficient. Our approach offers significant performance gains over the competing methods in simulations, especially when the features themselves are correlated. In addition to comprehensive simulation studies, we apply our method to a high-dimensional breast cancer data set to identify markers significantly associated with the disease. Copyright (C) 2014 John Wiley & Sons, Ltd.
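The abstract relies on the matrix-tree theorem to integrate out the tree-structured error dependence. The following is a minimal sketch of that theorem only, not the authors' sampler: for a symmetric matrix of non-negative edge weights, the sum over all spanning trees of the product of edge weights equals the determinant of the weighted graph Laplacian with one row and column removed. The function names and the random weight matrix `W` are assumptions made for illustration; the record does not give the paper's actual edge weights or priors.

```python
import itertools
import numpy as np


def tree_weight_sum_det(W):
    """Sum over spanning trees of the product of edge weights,
    computed with the matrix-tree theorem (one determinant)."""
    W = np.asarray(W, dtype=float)
    laplacian = np.diag(W.sum(axis=1)) - W
    # Any cofactor works; drop the first row and column.
    return np.linalg.det(laplacian[1:, 1:])


def _find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def tree_weight_sum_brute(W):
    """Same quantity by explicit enumeration (small graphs only)."""
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    edges = list(itertools.combinations(range(n), 2))
    total = 0.0
    for subset in itertools.combinations(edges, n - 1):
        parent = list(range(n))
        acyclic = True
        for i, j in subset:
            ri, rj = _find(parent, i), _find(parent, j)
            if ri == rj:  # edge would close a cycle
                acyclic = False
                break
            parent[ri] = rj
        # n - 1 acyclic edges on n nodes form a spanning tree.
        if acyclic:
            total += np.prod([W[i, j] for i, j in subset])
    return total


# Hypothetical symmetric weights on 5 nodes, zero diagonal.
rng = np.random.default_rng(0)
A = rng.uniform(0.1, 1.0, size=(5, 5))
W = np.triu(A, 1)
W = W + W.T

print(np.isclose(tree_weight_sum_det(W), tree_weight_sum_brute(W)))
# Unit weights on the complete graph K4 count its 16 spanning trees.
print(np.isclose(tree_weight_sum_det(np.ones((4, 4)) - np.eye(4)), 16.0))
```

Both checks should print `True`. The determinant costs a cubic number of operations in the number of nodes, whereas the number of spanning trees grows super-exponentially, which is why an analytic sum over trees can keep a collapsed sampler tractable.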
Pages: 258-272
Page count: 15
Related papers
50 in total
  • [21] A semi-parametric approach to feature selection in high-dimensional linear regression models
    Liu, Yuyang
    Pi, Pengfei
    Luo, Shan
    COMPUTATIONAL STATISTICS, 2023, 38 (02) : 979 - 1000
  • [22] Nearly optimal Bayesian shrinkage for high-dimensional regression
    Song, Qifan
    Liang, Faming
    SCIENCE CHINA-MATHEMATICS, 2023, 66 (02) : 409 - 442
  • [23] FEATURE SELECTION FOR HIGH-DIMENSIONAL DATA ANALYSIS
    Verleysen, Michel
    ECTA 2011/FCTA 2011: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON EVOLUTIONARY COMPUTATION THEORY AND APPLICATIONS AND INTERNATIONAL CONFERENCE ON FUZZY COMPUTATION THEORY AND APPLICATIONS, 2011,
  • [24] FEATURE SELECTION FOR HIGH-DIMENSIONAL DATA ANALYSIS
    Verleysen, Michel
    NCTA 2011: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON NEURAL COMPUTATION THEORY AND APPLICATIONS, 2011, : IS23 - IS25
  • [25] Adaptive Bayesian density regression for high-dimensional data
    Shen, Weining
    Ghosal, Subhashis
    BERNOULLI, 2016, 22 (01) : 396 - 420
  • [26] The sparsity and bias of the lasso selection in high-dimensional linear regression
    Zhang, Cun-Hui
    Huang, Jian
    ANNALS OF STATISTICS, 2008, 36 (04) : 1567 - 1594
  • [27] Feature selection for high-dimensional temporal data
    Michail Tsagris
    Vincenzo Lagani
    Ioannis Tsamardinos
    BMC Bioinformatics, 19
  • [28] Bayesian high-dimensional regression for change point analysis
    Datta, Abhirup
    Zou, Hui
    Banerjee, Sudipto
    STATISTICS AND ITS INTERFACE, 2019, 12 (02) : 253 - 264
  • [29] A Metropolized Adaptive Subspace Algorithm for High-Dimensional Bayesian Variable Selection
    Staerk, Christian
    Kateri, Maria
    Ntzoufras, Ioannis
    BAYESIAN ANALYSIS, 2024, 19 (01) : 261 - 291
  • [30] HDBRR: a statistical package for high-dimensional Bayesian ridge regression without MCMC
    Perez-Elizalde, Sergio
    Monroy-Castillo, Blanca E.
    Perez-Rodriguez, Paulino
    Crossa, Jose
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2022, 92 (17) : 3679 - 3705