Overlap in observational studies with high-dimensional covariates

Cited by: 67
Authors
D'Amour, Alexander [1, 2]
Ding, Peng [1]
Feller, Avi [1]
Lei, Lihua [1, 3]
Sekhon, Jasjeet [1]
Affiliations
[1] Univ Calif Berkeley, Dept Stat, Evans Hall, Berkeley, CA 94720 USA
[2] Google Res, Cambridge, MA 02140 USA
[3] Stanford Univ, Dept Stat, Stanford, CA 94305 USA
Funding
National Science Foundation (USA);
Keywords
Causal inference; Overlap; Information theory; Curse of dimensionality; Instrumental variables; Propensity score; Inference; Bias; Identification;
DOI
10.1016/j.jeconom.2019.10.014
Chinese Library Classification
F [Economics];
Discipline classification code
02;
Abstract
Estimating causal effects under exogeneity hinges on two key assumptions: unconfoundedness and overlap. Researchers often argue that unconfoundedness is more plausible when more covariates are included in the analysis. Less discussed is the fact that covariate overlap is more difficult to satisfy in this setting. In this paper, we explore the implications of overlap in observational studies with high-dimensional covariates and formalize a curse-of-dimensionality argument, suggesting that these assumptions are stronger than investigators likely realize. Our key innovation is to explore how strict overlap restricts global discrepancies between the covariate distributions in the treated and control populations. Exploiting results from information theory, we derive explicit bounds on the average imbalance in covariate means under strict overlap and show that these bounds become more restrictive as the dimension grows large. We discuss how these implications interact with assumptions and procedures commonly deployed in observational causal inference, including sparsity and trimming. (C) 2020 The Authors. Published by Elsevier B.V.
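The curse-of-dimensionality argument in the abstract can be illustrated with a small simulation. The sketch below is not taken from the paper: it assumes a toy model in which covariates are standard Gaussian, every covariate mean shifts by the same amount `delta` in the treated group, and treatment is assigned with probability 0.5. The function name `fraction_within_overlap` and the parameters `delta` and `eta` are hypothetical choices made for this illustration. Under these assumptions the true propensity score has a closed form, so one can check what share of units keep a propensity score inside [eta, 1 - eta] as the dimension p grows.

```python
import numpy as np


def fraction_within_overlap(p, delta=0.5, eta=0.1, n=20000, seed=0):
    """Share of simulated units whose true propensity score lies in [eta, 1 - eta].

    Toy model (an assumption of this sketch, not the paper's setup):
      T ~ Bernoulli(0.5), X | T=0 ~ N(0, I_p), X | T=1 ~ N(delta * 1_p, I_p).
    """
    rng = np.random.default_rng(seed)
    t = rng.integers(0, 2, size=n)                        # treatment indicator
    x = rng.standard_normal((n, p)) + delta * t[:, None]  # shift means for treated units
    # Log density ratio of the two Gaussians; equals the log-odds of treatment
    # because the marginal treatment probability is 0.5.
    log_odds = delta * x.sum(axis=1) - p * delta**2 / 2
    e = 1.0 / (1.0 + np.exp(-log_odds))                   # true propensity score
    return float(np.mean((e >= eta) & (e <= 1 - eta)))


if __name__ == "__main__":
    for p in (1, 10, 100):
        print(p, fraction_within_overlap(p))
```

With a fixed per-covariate mean shift, the share of units inside the bounds shrinks as p increases. Equivalently, keeping that share high in this toy model would require the per-covariate imbalance to shrink as the dimension grows, which is the qualitative point the abstract makes about strict overlap.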
Pages: 644-654
Number of pages: 11