Nonparametric regression with missing data

Times cited: 2
Authors
Efromovich, Sam [1]
Affiliations
[1] Univ Texas Dallas, Dept Math Sci, Richardson, TX 75083 USA
Funding
National Science Foundation (USA);
Keywords
adaptation; missing predictor; missing response; optimality;
DOI
10.1002/wics.1303
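Chinese Library Classification number
O21 [Probability theory and mathematical statistics]; C8 [Statistics];
Discipline classification code
020208; 070103; 0714;
Abstract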
Optimal estimation of a regression function, when either the response or the predictor may be missing at random, is considered. Missing at random (MAR) means that the conditional probability of missing, given the response and the predictor, does not depend on the variable whose values may be missing. Mean integrated squared error (MISE) is the statistical criterion used, and the nonparametric approach implies that no assumption about the shape of the regression function is made. It is shown that optimal estimation depends on which variable, the response or the predictor, is missing. For a setting with missing responses, optimal estimation is based only on complete cases of observations, and incomplete ones can be ignored. For a setting with missing predictors, optimal estimation is based on all cases, both complete and incomplete, and the procedure includes estimation of the conditional probability of missing the predictor given the response. The proposed estimators are completely data-driven, do not involve imputation of missing values, and adapt to the missing mechanism and to the smoothness of the estimated regression function. Theoretical results are complemented by an analysis of credit score survey data. © 2014 Wiley Periodicals, Inc.
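The MAR condition and the MISE criterion described in the abstract can be written out in symbols. The following is only a sketch in notation assumed here, not taken from this record: X is the predictor, Y the response, A the indicator that the possibly missing variable is observed, m the regression function, and \hat{m} its estimator. The unit interval as the predictor's support is likewise an assumption made for illustration.

```latex
% Assumed notation: A = 1 if the possibly missing variable is observed, A = 0 otherwise.
% Regression function being estimated:
\[ m(x) = \mathbb{E}[\,Y \mid X = x\,] \]
% MAR with a possibly missing response: availability does not depend on Y.
\[ \mathbb{P}(A = 1 \mid X, Y) = \mathbb{P}(A = 1 \mid X) \]
% MAR with a possibly missing predictor: availability does not depend on X.
\[ \mathbb{P}(A = 1 \mid X, Y) = \mathbb{P}(A = 1 \mid Y) \]
% Mean integrated squared error of an estimator \hat{m} (support [0,1] assumed):
\[ \mathrm{MISE}(\hat{m}, m) = \mathbb{E} \int_0^1 \bigl( \hat{m}(x) - m(x) \bigr)^2 \, dx \]
```

In this notation, the conditional probability of missing the predictor given the response, which the abstract says must be estimated in the missing-predictor setting, is 1 - P(A = 1 | Y).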
Pages: 265-275
Number of pages: 11