Variable selection for high-dimensional genomic data with censored outcomes using group lasso prior

Cited by: 8
Authors
Lee, Kyu Ha [1, 2]
Chakraborty, Sounak [3 ]
Sun, Jianguo [3 ]
Affiliations
[1] Forsyth Inst, Epidemiol & Biostat Core, Cambridge, MA USA
[2] Harvard Sch Dent Med, Dept Oral Hlth Policy & Epidemiol, Boston, MA USA
[3] Univ Missouri, Dept Stat, Columbia, MO 65211 USA
Funding
National Science Foundation (USA);
Keywords
Accelerated failure time model; Bayesian lasso; Gibbs sampler; Group lasso; Penalized regression; FAILURE TIME MODEL; MICROARRAY DATA; SURVIVAL ANALYSIS; HAZARD RATIOS; ELASTIC NET; COX MODEL; REGRESSION; PREDICTION; SHRINKAGE;
DOI
10.1016/j.csda.2017.02.014
Chinese Library Classification number
TP39 [Computer applications];
Discipline classification codes
081203 ; 0835 ;
Abstract
The variable selection problem is discussed in the context of high-dimensional failure time data arising from the accelerated failure time model. A data augmentation approach is employed in order to deal with censored survival times and to facilitate prior-posterior conjugacy. To identify a set of grouped relevant covariates, a shrinkage prior distribution is specified for regression coefficients mimicking the effect of group lasso penalty. It is noted that unlike the corresponding frequentist method, a Bayesian penalized regression approach cannot shrink the estimates of coefficients to exact zeros in general. Towards resolving the issue, a two-stage thresholding method that exploits the scaled neighborhood criterion and the Bayesian information criterion is devised. Simulation studies are performed to assess the robustness and performance of the proposed method in terms of variable selection accuracy and predictive power. The method is successfully applied to a set of microarray data on the individuals diagnosed with diffuse large B-cell lymphoma. In addition, an R package called psbcGroup, which can be downloaded freely from CRAN, is developed for the implementation of the methods. (C) 2017 Elsevier B.V. All rights reserved.
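The abstract does not spell out the thresholding rule, so the following is only a rough sketch of the first stage, assuming the scaled neighborhood criterion in its commonly described form: a coefficient is excluded when the posterior probability that it lies within one posterior standard deviation of zero exceeds a cutoff. The function name `scaled_neighborhood_select`, the cutoff of 0.5, and the toy draws are illustrative assumptions. This is not the psbcGroup interface, and the BIC-based second stage is omitted.

```python
import random
import statistics


def scaled_neighborhood_select(draws, threshold=0.5):
    """Return indices of coefficients kept by a scaled neighborhood rule.

    draws: list of posterior draws, each a list of coefficient values.
    threshold: hypothetical cutoff on the posterior probability.
    """
    n_coef = len(draws[0])
    kept = []
    for j in range(n_coef):
        samples = [d[j] for d in draws]
        # Posterior standard deviation estimated from the draws
        sd = statistics.pstdev(samples)
        # Share of draws lying within one posterior sd of zero
        prob_near_zero = sum(abs(b) <= sd for b in samples) / len(samples)
        # Keep the coefficient only if it is rarely that close to zero
        if prob_near_zero <= threshold:
            kept.append(j)
    return kept


# Toy draws standing in for Gibbs sampler output (assumed, not real data):
# coefficients 0 and 2 are centered away from zero, coefficient 1 at zero.
rng = random.Random(0)
toy_draws = [
    [rng.gauss(2.0, 0.3), rng.gauss(0.0, 0.3), rng.gauss(-1.5, 0.3)]
    for _ in range(2000)
]
selected = scaled_neighborhood_select(toy_draws, threshold=0.5)
```

In the two-stage scheme the abstract describes, the cutoff would presumably not be fixed in advance as it is here; candidate cutoffs could be compared by the Bayesian information criterion.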
Pages: 1 - 13
Number of pages: 13
Related papers
50 in total
  • [41] Variable selection in high-dimensional partly linear additive models
    Lian, Heng
    JOURNAL OF NONPARAMETRIC STATISTICS, 2012, 24 (04) : 825 - 839
  • [42] VARIABLE SELECTION FOR HIGH DIMENSIONAL MULTIVARIATE OUTCOMES
    Sofer, Tamar
    Dicker, Lee
    Lin, Xihong
    STATISTICA SINICA, 2014, 24 (04) : 1633 - 1654
  • [43] High-Dimensional LASSO-Based Computational Regression Models: Regularization, Shrinkage, and Selection
    Emmert-Streib, Frank
    Dehmer, Matthias
    MACHINE LEARNING AND KNOWLEDGE EXTRACTION, 2019, 1 (01) : 359 - 383
  • [44] ET-Lasso: A New Efficient Tuning of Lasso-type Regularization for High-Dimensional Data
    Yang, Songshan
    Wen, Jiawei
    Zhan, Xiang
    Kifer, Daniel
    KDD'19: PROCEEDINGS OF THE 25TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, 2019 : 607 - 616
  • [45] ORACLE INEQUALITIES AND SELECTION CONSISTENCY FOR WEIGHTED LASSO IN HIGH-DIMENSIONAL ADDITIVE HAZARDS MODEL
    Zhang, Haixiang
    Sun, Liuquan
    Zhou, Yong
    Huang, Jian
    STATISTICA SINICA, 2017, 27 (04) : 1903 - 1920
  • [46] Comparative study of computational algorithms for the Lasso with high-dimensional, highly correlated data
    Kim, Baekjin
    Yu, Donghyeon
    Won, Joong-Ho
    APPLIED INTELLIGENCE, 2018, 48 (08) : 1933 - 1952
  • [47] RETRACTED: Robust Model Selection and Estimation for Censored Survival Data with High Dimensional Genomic Covariates (Retracted Article)
    Chen, Guorong
    Wang, Sijian
    Sun, Guannan
    Pan, Huanxue
    ACTA BIOTHEORETICA, 2019, 67 (03) : 225 - 251
  • [48] On the accuracy in high-dimensional linear models and its application to genomic selection
    Rabier, Charles-Elie
    Mangin, Brigitte
    Grusea, Simona
    SCANDINAVIAN JOURNAL OF STATISTICS, 2019, 46 (01) : 289 - 313
  • [49] Ranking based variable selection for censored data using AFT models
    Khan, Md Hasinur Rahaman
    Akhter, Marzan
    COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2024, 53 (06) : 2917 - 2939
  • [50] A hybrid deterministic-deterministic approach for high-dimensional Bayesian variable selection with a default prior
    Lee, Jieun
    Goh, Gyuhyeong
    COMPUTATIONAL STATISTICS, 2024, 39 (03) : 1659 - 1681