Rank-based estimation in the l1-regularized partly linear model for censored outcomes with application to integrated analyses of clinical predictors and gene expression data

Cited by: 19
Authors
Johnson, Brent A. [1]
Affiliations
[1] Emory Univ, Dept Biostat, Atlanta, GA 30322 USA
Keywords
Lasso; Logrank; Penalized least squares; Survival analysis; Failure time model; Regularized estimation; Variable selection; Regression; Tests
DOI
10.1093/biostatistics/kxp020
Chinese Library Classification number
Q [Biological Sciences]
Discipline classification code
07; 0710; 09
Abstract
We consider estimation and variable selection in the partial linear model for censored data. The partial linear model for censored data is a direct extension of the accelerated failure time model, which is itself an important alternative to the proportional hazards model. We extend rank-based lasso-type estimators to a model that may contain nonlinear effects. Variable selection in such a partial linear model has direct application to high-dimensional survival analyses that attempt to adjust for clinical predictors. In the microarray setting, previous methods can adjust for other clinical predictors by assuming that clinical and gene expression data enter the model linearly in the same fashion. Here, we select important variables after adjusting for prognostic clinical variables, but the clinical effects are assumed to be nonlinear. Our estimator is based on stratification and can be extended naturally to account for multiple nonlinear effects. We illustrate the utility of our method through simulation studies and an application to the Wisconsin prognostic breast cancer data set.
Pages: 659-666
Number of pages: 8