Buckley-James Boosting for Survival Analysis with High-Dimensional Biomarker Data

Cited by: 32
Authors
Wang, Zhu [1]
Wang, C. Y. [2]
Affiliations
[1] Yale Univ, New Haven, CT 06520 USA
[2] Fred Hutchinson Canc Res Ctr, Seattle, WA 98109 USA
Funding
National Institutes of Health (NIH), USA
Keywords
boosting; accelerated failure time model; Buckley-James estimator; censored survival data; LASSO; variable selection; high-order interactions; partial least-squares; regularized estimation; microarray data; regression; failure; models; genes; reduction; selection
DOI
10.2202/1544-6115.1550
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification code
071010; 081704
Abstract
There has been increasing interest in predicting patients' survival after therapy from gene expression microarray data. In regression and classification models with high-dimensional genomic data, boosting has been successfully applied to build accurate predictive models and conduct variable selection simultaneously. We propose Buckley-James boosting for semiparametric accelerated failure time models with right-censored survival data, which can be used to predict the survival of future patients from high-dimensional genomic data. In the spirit of the adaptive LASSO, twin boosting is also incorporated to fit sparser models. The proposed methods provide a unified approach to fitting linear models and non-linear effects models with possible interactions, and they perform variable selection and parameter estimation simultaneously. The proposed methods are evaluated by simulations and applied to a recent microarray gene expression data set for patients with diffuse large B-cell lymphoma under the current gold standard therapy.
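The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a linear accelerated failure time model on the log-time scale, a componentwise least-squares base learner, no tied residuals, and the common convention of treating the largest residual as uncensored. Twin boosting is not included. The names `bj_impute` and `bj_boost`, the step size `nu`, the iteration count `n_iter`, and the simulated data are all hypothetical choices made for illustration.

```python
import random


def bj_impute(y, delta, pred):
    """Buckley-James step: replace each censored response by
    pred + E[residual | residual > observed residual], with the residual
    distribution estimated by Kaplan-Meier (assumes no ties)."""
    n = len(y)
    resid = [y[i] - pred[i] for i in range(n)]
    order = sorted(range(n), key=lambda i: resid[i])
    d = [delta[i] for i in order]
    d[-1] = 1  # convention: largest residual is treated as uncensored
    surv, mass, s = [], [], 1.0
    for k in range(n):
        s_new = s * (1.0 - d[k] / (n - k))  # n - k subjects still at risk
        mass.append(s - s_new)              # Kaplan-Meier jump at residual k
        surv.append(s_new)
        s = s_new
    # tail[k] = sum over sorted positions j > k of residual_j * mass_j
    tail = [0.0] * n
    for k in range(n - 2, -1, -1):
        tail[k] = tail[k + 1] + resid[order[k + 1]] * mass[k + 1]
    y_star = list(y)
    for k, i in enumerate(order):
        if d[k] == 0:
            y_star[i] = pred[i] + tail[k] / surv[k]
    return y_star


def bj_boost(X, y, delta, n_iter=100, nu=0.1):
    """Componentwise L2 boosting on Buckley-James imputed responses.
    Assumes no column of X is constant. Returns (intercept, beta)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    cols = [[row[j] - means[j] for row in X] for j in range(p)]
    norms = [sum(v * v for v in col) for col in cols]
    beta = [0.0] * p
    fit = [0.0] * n  # centered linear predictor
    intercept = sum(y) / n
    for _ in range(n_iter):
        y_star = bj_impute(y, delta, [intercept + f for f in fit])
        intercept = sum(y_star) / n
        u = [y_star[i] - intercept - fit[i] for i in range(n)]
        best_j, best_gain, best_coef = 0, -1.0, 0.0
        for j in range(p):
            coef = sum(c * r for c, r in zip(cols[j], u)) / norms[j]
            gain = coef * coef * norms[j]  # reduction in squared error
            if gain > best_gain:
                best_j, best_gain, best_coef = j, gain, coef
        # shrunken update of the single best-fitting covariate
        beta[best_j] += nu * best_coef
        fit = [fit[i] + nu * best_coef * cols[best_j][i] for i in range(n)]
    raw_intercept = intercept - sum(m * b for m, b in zip(means, beta))
    return raw_intercept, beta


# Simulated example (hypothetical data): only covariates 0 and 1 matter.
random.seed(0)
n, p = 150, 20
X = [[random.gauss(0, 1) for _ in range(p)] for _ in range(n)]
log_t = [2.0 * r[0] - 1.5 * r[1] + 0.5 * random.gauss(0, 1) for r in X]
log_c = [random.gauss(2.0, 1.0) for _ in range(n)]
y = [min(t, c) for t, c in zip(log_t, log_c)]
delta = [1 if t <= c else 0 for t, c in zip(log_t, log_c)]
intercept, beta = bj_boost(X, y, delta, n_iter=150, nu=0.1)
selected = [j for j in range(p) if beta[j] != 0.0]
```

Because each iteration updates a single coefficient, covariates that are never chosen keep a coefficient of exactly zero, which is how boosting with early stopping performs variable selection and estimation together in this sketch.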
Pages: 33