Bayesian feature selection in high-dimensional regression in presence of correlated noise

Cited by: 1
Authors
Feldman, Guy [1 ]
Bhadra, Anindya [1 ]
Kirshner, Sergey [1 ]
Affiliations
[1] Purdue Univ, Dept Stat, 250 N Univ St, W Lafayette, IN 47907 USA
Source
STAT | 2014, Vol. 3, Iss. 01
Keywords
Bayesian methods; genomics; graphical models; high-dimensional data; variable selection;
DOI
10.1002/sta4.60
CLC classification
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];
Discipline codes
020208 ; 070103 ; 0714 ;
Abstract
We consider the problem of feature selection in a high-dimensional regression setting with multiple predictors and multiple responses. Assuming that regression errors are i.i.d. when they are in fact dependent leads to inconsistent and inefficient feature estimates. We relax the i.i.d. assumption by allowing the errors to exhibit a tree-structured dependence. This allows a Bayesian problem formulation in which the error dependence structure is treated as an auxiliary variable that can be integrated out analytically with the help of the matrix-tree theorem. Mixing over trees yields a flexible technique for modelling the graphical structure of the regression errors. Furthermore, the analytic integration results in a collapsed Gibbs sampler for feature selection that is computationally efficient. Our approach offers significant performance gains over competing methods in simulations, especially when the features themselves are correlated. In addition to comprehensive simulation studies, we apply our method to a high-dimensional breast cancer data set to identify markers significantly associated with the disease. Copyright (C) 2014 John Wiley & Sons, Ltd.
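The analytic integration over trees described in the abstract rests on Kirchhoff's matrix-tree theorem, which expresses the sum over all spanning trees of a weighted graph (of the products of their edge weights) as a cofactor of the graph Laplacian. The following is a minimal numerical sketch of that identity only, not the authors' implementation; the function name and example graph are hypothetical:

```python
import numpy as np

def spanning_tree_weight_sum(W):
    """Sum over all spanning trees of the product of edge weights
    (Kirchhoff's matrix-tree theorem).
    W: symmetric (n x n) nonnegative edge-weight matrix, zero diagonal."""
    L = np.diag(W.sum(axis=1)) - W          # weighted graph Laplacian
    # Any cofactor of L equals the spanning-tree weight sum;
    # here we delete the first row and column.
    return np.linalg.det(L[1:, 1:])

# Example: a triangle with unit edge weights has exactly 3 spanning trees.
W = np.array([[0., 1., 1.],
              [1., 0., 1.],
              [1., 1., 0.]])
print(round(spanning_tree_weight_sum(W)))   # 3
```

In the paper's setting, the determinant replaces an exponential-size sum over tree-structured error dependence graphs with a single O(n^3) computation, which is what makes the collapsed Gibbs sampler tractable.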
Pages: 258 - 272
Page count: 15
Related papers
50 records in total
  • [41] Optimal Feature Selection in High-Dimensional Discriminant Analysis
    Kolar, Mladen
    Liu, Han
    IEEE TRANSACTIONS ON INFORMATION THEORY, 2015, 61 (02) : 1063 - 1083
  • [42] Efficient feature selection filters for high-dimensional data
    Ferreira, Artur J.
    Figueiredo, Mario A. T.
    PATTERN RECOGNITION LETTERS, 2012, 33 (13) : 1794 - 1804
  • [43] Improved PSO for feature selection on high-dimensional datasets
    Tran, Binh
    LECTURE NOTES IN COMPUTER SCIENCE, Springer Verlag, 8886 : 503 - 515
  • [44] A comparison study of Bayesian high-dimensional linear regression models
    Shin, Ju-Won
    Lee, Kyoungjae
    KOREAN JOURNAL OF APPLIED STATISTICS, 2021, 34 (03) : 491 - 505
  • [45] A stepwise regression algorithm for high-dimensional variable selection
    Hwang, Jing-Shiang
    Hu, Tsuey-Hwa
    JOURNAL OF STATISTICAL COMPUTATION AND SIMULATION, 2015, 85 (09) : 1793 - 1806
  • [46] A Survey of Tuning Parameter Selection for High-Dimensional Regression
    Wu, Yunan
    Wang, Lan
    ANNUAL REVIEW OF STATISTICS AND ITS APPLICATION, 2020, 7 : 209 - 226
  • [47] High-Dimensional Feature Selection by Feature-Wise Kernelized Lasso
    Yamada, Makoto
    Jitkrittum, Wittawat
    Sigal, Leonid
    Xing, Eric P.
    Sugiyama, Masashi
    NEURAL COMPUTATION, 2014, 26 (01) : 185 - 207
  • [48] High-dimensional Ising model selection with Bayesian information criteria
    Barber, Rina Foygel
    Drton, Mathias
    ELECTRONIC JOURNAL OF STATISTICS, 2015, 9 (01): : 567 - 607
  • [49] Sparse Bayesian variable selection for classifying high-dimensional data
    Yang, Aijun
    Lian, Heng
    Jiang, Xuejun
    Liu, Pengfei
    STATISTICS AND ITS INTERFACE, 2018, 11 (02) : 385 - 395
  • [50] Bayesian stein-type shrinkage estimators in high-dimensional linear regression models
    Zanboori, Ahmadreza
    Zanboori, Ehsan
    Mousavi, Maryam
    Mirjalili, Sayyed Mahmoud
    SAO PAULO JOURNAL OF MATHEMATICAL SCIENCES, 2024, 18 (02): : 1889 - 1914