How to Make Model-free Feature Screening Approaches for Full Data Applicable to the Case of Missing Response?

Citations: 14
Authors
Wang, Qihua [1, 2]
Li, Yongjin [1]
Affiliations
[1] Chinese Acad Sci, Acad Math & Syst Sci, Beijing 100190, Peoples R China
[2] Shenzhen Univ, Inst Stat Sci, Shenzhen, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
borrowing missingness information; missing data; ultrahigh dimensionality; variable screening; GENERALIZED LINEAR-MODELS; VARIABLE SELECTION; EMPIRICAL LIKELIHOOD; ORACLE PROPERTIES; ALGORITHM; LASSO;
DOI
10.1111/sjos.12290
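Chinese Library Classification number
O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract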
It is quite a challenge to develop model-free feature screening approaches for missing response problems because the existing standard methods for missing data analysis cannot be applied directly to the high-dimensional case. This paper develops some novel methods that borrow information from the missingness indicators, so that any feature screening procedure for ultrahigh-dimensional covariates with full data can be applied to the missing response case. The first method is the so-called missing indicator imputation screening, which is developed by proving that, under some mild conditions, the set of active predictors of interest for the response is a subset of the set of active predictors for the product of the response and the missingness indicator. As an alternative, another method, called the Venn diagram-based approach, is also developed. The sure screening property is proven for both methods. It is also shown that complete case analysis preserves the sure screening property of any feature screening approach that has this property.
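The idea behind missing indicator imputation screening can be illustrated with a minimal sketch. This is not the paper's procedure: it assumes marginal Pearson correlation ranking as the full-data screener, and the simulated data, the logistic missingness mechanism, and the names `marginal_correlation_screen`, `simulate`, and `d=20` are all hypothetical choices made for illustration. The only point taken from the abstract is that the product of the response and the missingness indicator is always computable, so a full-data screener can be run on it.

```python
import numpy as np


def marginal_correlation_screen(X, y, d):
    """Return the indices of the d features with largest |Pearson correlation| with y.

    Stand-in for "any full-data screening procedure" (an assumption of this sketch).
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    corr = np.abs(Xc.T @ yc) / denom
    return np.argsort(corr)[::-1][:d]


def simulate(n=400, p=1000, seed=0):
    """Hypothetical data: features 0 and 1 drive Y; feature 2 drives missingness."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    y = 3.0 * X[:, 0] - 3.0 * X[:, 1] + rng.standard_normal(n)
    prob_observed = 1.0 / (1.0 + np.exp(-(1.0 + X[:, 2])))
    delta = (rng.random(n) < prob_observed).astype(int)  # 1 = response observed
    y_obs = np.where(delta == 1, y, np.nan)              # missing responses are NaN
    return X, y_obs, delta


X, y_obs, delta = simulate()

# delta * Y: equals Y where observed and 0 where missing, so it never
# requires the unobserved responses.
y_times_delta = np.where(delta == 1, y_obs, 0.0)

# Apply the full-data screener to the product instead of to Y itself.
selected = marginal_correlation_screen(X, y_times_delta, d=20)
```

Under the subset relation stated in the abstract, the retained set is meant to cover the active predictors of the response, although it may also retain predictors that matter only for the missingness indicator.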
Pages: 324-346
Page count: 23