IDENTIFICATION AND INFERENCE WITH NONIGNORABLE MISSING COVARIATE DATA

Cited by: 17
Authors
Miao, Wang [1]
Tchetgen, Eric Tchetgen [2]
Affiliations
[1] Peking Univ, Guanghua Sch Management, Beijing 100871, Peoples R China
[2] Harvard Univ, Dept Biostat, Boston, MA 02115 USA
Keywords
Identification; missing covariate data; missing not at random; shadow variable; generalized linear models; regression analysis; logistic regression; maximum likelihood; incomplete data; nonresponse; variables; patterns
DOI
10.5705/ss.202016.0322
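Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification code
020208; 070103; 0714
Abstract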
We study identification of parametric and semiparametric models with missing covariate data. When covariate data are missing not at random, identification is not guaranteed even under fairly restrictive parametric assumptions, a fact that is illustrated with several examples. We propose a general approach to establish identification of parametric and semiparametric models when a covariate is missing not at random. Without auxiliary information about the missingness process, identification of parametric models is strongly dependent on model specification. However, in the presence of a fully observed shadow variable that is correlated with the missing covariate but otherwise independent of the missingness conditional on the covariate, identification is more broadly achievable, including in fairly large semiparametric models. Special consideration is given to the generalized linear models with the missingness process unrestricted. Under such a setting, the outcome model is identified for a number of familiar generalized linear models, and we provide counterexamples when identification fails. For estimation, we describe an inverse probability weighted estimator that incorporates the shadow variable to estimate the propensity score model, and we evaluate its performance via simulations. We further illustrate the shadow variable approach with a data example about home prices in China.
Pages: 2049-2067
Page count: 19