Buckley-James Boosting for Survival Analysis with High-Dimensional Biomarker Data

Times cited: 32
Authors
Wang, Zhu [1]
Wang, C. Y. [2]
Affiliations
[1] Yale University, New Haven, CT 06520, USA
[2] Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA
Funding
National Institutes of Health (USA);
Keywords
boosting; accelerated failure time model; Buckley-James estimator; censored survival data; LASSO; variable selection; HIGH-ORDER INTERACTIONS; PARTIAL LEAST-SQUARES; REGULARIZED ESTIMATION; MICROARRAY DATA; REGRESSION; FAILURE; MODELS; GENES; REDUCTION; SELECTION;
DOI
10.2202/1544-6115.1550
Chinese Library Classification number
Q5 [Biochemistry]; Q7 [Molecular Biology];
Subject classification code
071010; 081704;
Abstract
There has been increasing interest in predicting patients' survival after therapy from gene expression microarray data. In regression and classification models with high-dimensional genomic data, boosting has been successfully applied to build accurate predictive models and conduct variable selection simultaneously. We propose Buckley-James boosting for semiparametric accelerated failure time models with right-censored survival data, which can be used to predict the survival of future patients from high-dimensional genomic data. In the spirit of the adaptive LASSO, twin boosting is also incorporated to fit sparser models. The proposed methods provide a unified approach to fitting linear models and non-linear effects models with possible interactions, and they perform variable selection and parameter estimation simultaneously. The proposed methods are evaluated by simulations and applied to a recent microarray gene expression data set for patients with diffuse large B-cell lymphoma under the current gold standard therapy.
Pages: 33