Conditional screening for ultrahigh-dimensional survival data in case-cohort studies

Cited by: 1
Authors
Zhang, Jing [1]
Zhou, Haibo [2]
Liu, Yanyan [3]
Cai, Jianwen [2]
Affiliations
[1] Zhongnan Univ Econ & Law, Sch Stat & Math, Wuhan 430073, Peoples R China
[2] Univ N Carolina, Dept Biostat, Chapel Hill, NC 27599 USA
[3] Wuhan Univ, Sch Math & Stat, Wuhan 430072, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Case-cohort design; Conditional screening; Sure screening property; Survival data; Ultrahigh-dimensional data; Weighted estimating equation; VARIABLE SELECTION; REGRESSION-MODEL; LIKELIHOOD; EFFICIENCY;
DOI
10.1007/s10985-021-09531-7
Chinese Library Classification number
O1 [Mathematics];
Subject classification codes
0701 ; 070101 ;
Abstract
The case-cohort design has been widely used to reduce the cost of covariate measurements in large cohort studies. In many such studies, the number of covariates is very large, and the goal of the research is to identify active covariates that strongly influence the response. Since the introduction of sure independence screening, screening procedures have achieved great success in effectively reducing the dimensionality and identifying active covariates. However, because commonly used screening methods are based on marginal correlation or its variants, they may fail to identify hidden active variables, which are jointly important but only weakly correlated with the response. Moreover, these screening methods are mainly proposed for data obtained under simple random sampling and cannot be directly applied to case-cohort data. In this paper, we consider ultrahigh-dimensional survival data under the case-cohort design and propose a conditional screening method that incorporates important prior information about the active variables. This method can effectively detect hidden active variables. Furthermore, it possesses the sure screening property under some mild regularity conditions and does not require any complicated numerical optimization. We evaluate the finite-sample performance of the proposed method via extensive simulation studies and further illustrate the new approach using a real data set from patients with breast cancer.
Pages: 632-661
Number of pages: 30