Buckley-James Boosting for Survival Analysis with High-Dimensional Biomarker Data

Times cited: 34
Authors
Wang, Zhu [1]
Wang, C. Y. [2]
Affiliations
[1] Yale Univ, New Haven, CT 06520 USA
[2] Fred Hutchinson Canc Res Ctr, Seattle, WA 98109 USA
Funding
National Institutes of Health (USA)
Keywords
boosting; accelerated failure time model; Buckley-James estimator; censored survival data; LASSO; variable selection; high-order interactions; partial least-squares; regularized estimation; microarray data; regression; failure; models; genes; reduction; selection
DOI
10.2202/1544-6115.1550
Chinese Library Classification
Q5 [Biochemistry]; Q7 [Molecular Biology]
Discipline classification codes
071010; 081704
Abstract
There has been increasing interest in predicting patients' survival after therapy from gene expression microarray data. In regression and classification models with high-dimensional genomic data, boosting has been successfully applied to build accurate predictive models and conduct variable selection simultaneously. We propose Buckley-James boosting for semiparametric accelerated failure time models with right-censored survival data, which can be used to predict the survival of future patients from high-dimensional genomic data. In the spirit of the adaptive LASSO, twin boosting is also incorporated to fit sparser models. The proposed methods offer a unified approach to fitting linear models and non-linear effects models with possible interactions, and they perform variable selection and parameter estimation simultaneously. The methods are evaluated by simulations and applied to a recent microarray gene expression data set for patients with diffuse large B-cell lymphoma treated with the current gold standard therapy.
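The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear accelerated failure time model on the log-time scale, componentwise least-squares base learners, and a fixed step size. The function names (`km_jumps`, `bj_impute`, `bj_boost`, `simulate`) and all parameter values are hypothetical. Each boosting step replaces censored responses with their Buckley-James conditional expectations under a Kaplan-Meier estimate of the residual distribution, then moves a small step along the single covariate that best fits the current residuals. This is how variable selection and estimation happen together.

```python
import random

def km_jumps(resid, delta):
    """Kaplan-Meier probability mass at each residual (0 for censored ones).
    The largest residual is treated as uncensored so the masses sum to 1."""
    n = len(resid)
    # Sort by residual; at ties, events come before censored observations.
    order = sorted(range(n), key=lambda i: (resid[i], -delta[i]))
    d = list(delta)
    d[order[-1]] = 1
    mass, surv = [0.0] * n, 1.0
    for rank, i in enumerate(order):
        if d[i]:
            mass[i] = surv / (n - rank)  # n - rank observations still at risk
            surv -= mass[i]
    return mass

def bj_impute(y, delta, fitted):
    """Replace each censored y_i by fitted_i + E[e | e > e_i] under the KM estimate."""
    n = len(y)
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    mass = km_jumps(resid, delta)
    y_star = list(y)
    for i in range(n):
        if delta[i] == 0:
            tail = [(resid[k], mass[k]) for k in range(n) if resid[k] > resid[i]]
            w = sum(m for _, m in tail)
            if w > 0:  # no mass beyond the largest residual: keep y_i as is
                y_star[i] = fitted[i] + sum(e * m for e, m in tail) / w
    return y_star

def bj_boost(X, y, delta, steps=100, nu=0.1):
    """Componentwise L2 boosting on Buckley-James imputed responses.
    Assumes no covariate column is constant. Returns the intercept (for
    mean-centred covariates) and the coefficient vector."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]
    ss = [sum(row[j] ** 2 for row in Xc) for j in range(p)]
    beta, intercept = [0.0] * p, sum(y) / n
    fitted = [intercept] * n
    for _ in range(steps):
        y_star = bj_impute(y, delta, fitted)
        u = [ys - f for ys, f in zip(y_star, fitted)]
        shift = sum(u) / n          # absorb the residual mean into the intercept
        intercept += shift
        u = [ui - shift for ui in u]
        # Least-squares slope of the residuals on each covariate separately.
        coef = [sum(row[j] * ui for row, ui in zip(Xc, u)) / ss[j] for j in range(p)]
        # Pick the covariate giving the largest drop in residual sum of squares.
        best = max(range(p), key=lambda j: coef[j] ** 2 * ss[j])
        beta[best] += nu * coef[best]
        fitted = [intercept + sum(b * xj for b, xj in zip(beta, row)) for row in Xc]
    return intercept, beta

def simulate(n=100, p=10, seed=1):
    """Toy data: only the first covariate affects log survival time."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
    log_t = [1.0 + 2.0 * row[0] + rng.gauss(0, 0.5) for row in X]
    log_c = [rng.gauss(2.5, 2.0) for _ in range(n)]  # independent censoring
    y = [min(t, c) for t, c in zip(log_t, log_c)]
    delta = [1 if t <= c else 0 for t, c in zip(log_t, log_c)]
    return X, y, delta

if __name__ == "__main__":
    X, y, delta = simulate()
    intercept, beta = bj_boost(X, y, delta)
    print([round(b, 2) for b in beta])
```

In this sketch, stopping early (a small `steps`) leaves most coefficients at exactly zero, which is the sense in which boosting selects variables. Twin boosting, as described in the abstract, would add a second boosting round that favours the variables selected in the first; it is not implemented here.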
Pages: 33