Combining Multiple Observational Data Sources to Estimate Causal Effects

Cited by: 39

Authors:

Yang, Shu [1]

Ding, Peng [2]

Affiliations:

[1] North Carolina State Univ, Dept Stat, 2311 Stinson Dr Campus Box 8203, Raleigh, NC 27695 USA

[2] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA

Source:

JOURNAL OF THE AMERICAN STATISTICAL ASSOCIATION | 2020 / Volume 115 / Issue 531

Keywords:

Calibration; Causal inference; Inverse probability weighting; Missing confounder; Two-phase sampling; PROPENSITY SCORE CALIBRATION; DOUBLY ROBUST ESTIMATION; LARGE-SAMPLE PROPERTIES; AUXILIARY INFORMATION; MISSING CONFOUNDERS; MATCHING ESTIMATORS; VALIDATION DATA; REGRESSION; INFERENCE; 2-PHASE;

DOI:

10.1080/01621459.2019.1609973

Chinese Library Classification (CLC) number:

O21 [Probability Theory and Mathematical Statistics]; C8 [Statistics];

Discipline classification code:

020208 ; 070103 ; 0714 ;

Abstract:

The era of big data has witnessed an increasing availability of multiple data sources for statistical analyses. We consider estimation of causal effects combining big main data with unmeasured confounders and smaller validation data with supplementary information on these confounders. Under the unconfoundedness assumption with completely observed confounders, the smaller validation data allow for constructing consistent estimators for causal effects, but the big main data can only give error-prone estimators in general. However, by leveraging the information in the big main data in a principled way, we can improve the estimation efficiencies yet preserve the consistencies of the initial estimators based solely on the validation data. Our framework applies to asymptotically normal estimators, including the commonly used regression imputation, weighting, and matching estimators, and does not require a correct specification of the model relating the unmeasured confounders to the observed variables. We also propose appropriate bootstrap procedures, which make our method straightforward to implement using software routines for existing estimators. Supplementary materials for this article are available online.
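The idea in the abstract can be pictured with a small simulation. The sketch below is only an illustration under stated assumptions, not the authors' implementation: it assumes a control-variate style adjustment in which a consistent estimate from the validation data is corrected by the discrepancy between two error-prone estimates (one from the validation data, one from the main data), with the adjustment coefficient estimated by a simple bootstrap. The data-generating model, the OLS estimator, and the names `ols_effect`, `combine`, `simulate`, and `estimate` are all hypothetical choices made for this example.

```python
import numpy as np

def ols_effect(y, a, covs):
    """Coefficient on treatment `a` from an OLS fit of y on (1, a, covs)."""
    design = np.column_stack([np.ones_like(y), a, covs])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1]

def combine(tau_val, tau_val_ep, tau_main_ep, gamma, v):
    """Assumed adjustment: subtract (gamma / v) times the error-prone discrepancy."""
    return tau_val - (gamma / v) * (tau_val_ep - tau_main_ep)

def simulate(n, rng, tau=1.0):
    """Hypothetical data: x is always observed, u only in the validation data."""
    x = rng.normal(size=n)
    u = rng.normal(size=n)
    p = 1.0 / (1.0 + np.exp(-(0.5 * x + 0.5 * u)))
    a = rng.binomial(1, p).astype(float)
    y = tau * a + x + u + rng.normal(size=n)
    return y, a, x, u

def estimate(main, val, n_boot=200, seed=0):
    """Combine a validation-only estimate with error-prone estimates."""
    rng = np.random.default_rng(seed)
    y1, a1, x1 = main          # main data: u is not available
    y2, a2, x2, u2 = val       # validation data: u is available

    tau_val = ols_effect(y2, a2, np.column_stack([x2, u2]))  # adjusts for x and u
    tau_val_ep = ols_effect(y2, a2, x2)                      # error-prone, omits u
    tau_main_ep = ols_effect(y1, a1, x1)                     # error-prone, omits u

    # Bootstrap the pair (validation estimate, error-prone discrepancy),
    # resampling the two independent samples separately.
    draws = np.empty((n_boot, 2))
    for b in range(n_boot):
        i = rng.integers(0, len(y1), len(y1))
        j = rng.integers(0, len(y2), len(y2))
        t_val = ols_effect(y2[j], a2[j], np.column_stack([x2[j], u2[j]]))
        diff = ols_effect(y2[j], a2[j], x2[j]) - ols_effect(y1[i], a1[i], x1[i])
        draws[b] = (t_val, diff)

    cov = np.cov(draws, rowvar=False)
    var_val, gamma, v = cov[0, 0], cov[0, 1], cov[1, 1]
    return {
        "tau_validation": float(tau_val),
        "tau_adjusted": float(combine(tau_val, tau_val_ep, tau_main_ep, gamma, v)),
        "var_validation": float(var_val),
        "var_adjusted": float(var_val - gamma ** 2 / v),
    }

data_rng = np.random.default_rng(42)
y1, a1, x1, _ = simulate(5000, data_rng)   # large main sample, u discarded
val = simulate(500, data_rng)              # small validation sample
result = estimate((y1, a1, x1), val)
print(result)
```

In this sketch the two error-prone estimates omit the same confounder, so their difference should be centered near zero while remaining correlated with the validation estimate; subtracting a scaled version of it cannot increase the bootstrap variance estimate and leaves the target of the validation estimate unchanged.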


Pages: 1540-1554

Number of pages: 15
