Independent screening for single-index hazard rate models with ultrahigh dimensional features

Cited by: 81
Authors
Gorst-Rasmussen, Anders [1]
Scheike, Thomas [2]
Affiliations
[1] Aalborg Univ, DK-9220 Aalborg, Denmark
[2] Univ Copenhagen, DK-1168 Copenhagen, Denmark
Keywords
Additive hazards model; Independent screening; Survival data; Ultrahigh dimension; Variable selection; NONCONCAVE PENALIZED LIKELIHOOD; VARIABLE SELECTION; REGRESSION; LASSO; INEQUALITIES; LINEARITY;
DOI
10.1111/j.1467-9868.2012.01039.x
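Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Subject classification codes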
020208 ; 070103 ; 0714 ;
Abstract
In data sets with many more features than observations, independent screening based on all univariate regression models leads to a computationally convenient variable selection method. Recent efforts have shown that, in the case of generalized linear models, independent screening may suffice to capture all relevant features with high probability, even in ultrahigh dimension. It is unclear whether this formal sure screening property is attainable when the response is a right-censored survival time. We propose a computationally very efficient independent screening method for survival data which can be viewed as the natural survival equivalent of correlation screening. We state conditions under which the method admits the sure screening property within a class of single-index hazard rate models with ultrahigh dimensional features and describe the generally detrimental effect of censoring on performance. An iterative variant of the method is also described which combines screening with penalized regression to handle more complex feature covariance structures. The methodology is evaluated through simulation studies and through application to a real gene expression data set.
Pages: 217-245
Number of pages: 29