Variable selection in double/debiased machine learning for causal inference: an outcome-adaptive approach

Cited by: 4
Authors
Kabata, Daijiro [1 ,2 ]
Shintani, Mototsugu [3 ]
Affiliations
[1] Univ Tokyo, Grad Sch Engn, Dept Adv Interdisciplinary Studies, Tokyo, Japan
[2] Osaka City Univ, Grad Sch Med, Dept Med Stat, Osaka, Japan
[3] Univ Tokyo, Fac Econ, Tokyo, Japan
Keywords
Causal inference; Double/debiased machine learning; High-dimensional data; Machine learning; Outcome-adaptive lasso; PROPENSITY SCORE ESTIMATION; SUPPORT; LASSO
DOI
10.1080/03610918.2021.2001655
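Chinese Library Classification (CLC) number
O21 [Probability theory and mathematical statistics]; C8 [Statistics]
Subject classification codes
020208; 070103; 0714
Abstract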
Access to high-dimensional data has made the use of machine learning in causal inference more common in recent years. The double/debiased machine learning (DML) estimator for the treatment effect is designed to yield valid inference when the nuisance functions in the treatment and outcome equations are estimated using machine learning methods. However, when some covariates in the treatment equation do not appear in the outcome equation, including them in the propensity score estimation increases the bias and variance of the DML estimator. To solve this issue, we introduce an outcome-adaptive DML estimator, which incorporates the outcome-adaptive lasso for variable selection in the propensity score estimation. We evaluate the performance of the proposed method using Monte Carlo simulation. The results indicate that the proposed method outperforms other methods in many cases.
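The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation; it only assumes the general recipe the abstract names: an outcome-adaptive lasso for the propensity score (per-covariate L1 penalties that grow as a covariate's coefficient in an outcome regression shrinks), plugged into a cross-fitted doubly robust (AIPW) score for the average treatment effect. The function names, the simulated design, the tuning values `lam` and `gamma`, the propensity clipping bounds, and the use of ordinary least squares as a stand-in for the machine learning outcome model are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def outcome_adaptive_penalty(X, a, y, lam, gamma):
    # Per-covariate penalties lam * |beta_j|^(-gamma), where beta_j comes from
    # an OLS regression of y on (1, a, X). Covariates unrelated to the
    # outcome receive a very large penalty.
    D = np.column_stack([np.ones(len(y)), a, X])
    beta = np.linalg.lstsq(D, y, rcond=None)[0][2:]
    return lam * np.maximum(np.abs(beta), 1e-8) ** (-gamma)

def weighted_l1_logistic(X, a, penalty, n_iter=500):
    # Proximal-gradient (ISTA) fit of a logistic model with a separate L1
    # penalty per coefficient; the intercept is left unpenalized.
    n, p = X.shape
    Z = np.column_stack([np.ones(n), X])
    step = 4.0 * n / np.linalg.norm(Z, 2) ** 2  # 1 / Lipschitz constant
    thresh = step * np.concatenate([[0.0], penalty])
    coef = np.zeros(p + 1)
    for _ in range(n_iter):
        grad = Z.T @ (sigmoid(Z @ coef) - a) / n
        u = coef - step * grad
        coef = np.sign(u) * np.maximum(np.abs(u) - thresh, 0.0)  # soft-threshold
    return coef

def oa_dml_ate(X, a, y, lam=0.02, gamma=2.0, n_folds=2, seed=0):
    # Cross-fitted AIPW estimate of the ATE and its standard error.
    n = len(y)
    folds = np.random.default_rng(seed).permutation(n) % n_folds
    psi = np.empty(n)
    for k in range(n_folds):
        tr, te = folds != k, folds == k
        pen = outcome_adaptive_penalty(X[tr], a[tr], y[tr], lam, gamma)
        coef = weighted_l1_logistic(X[tr], a[tr], pen)
        e = np.clip(sigmoid(coef[0] + X[te] @ coef[1:]), 0.01, 0.99)
        m = {}
        for arm in (0, 1):
            # Outcome model per treatment arm: OLS as a stand-in learner.
            idx = tr & (a == arm)
            D = np.column_stack([np.ones(idx.sum()), X[idx]])
            b = np.linalg.lstsq(D, y[idx], rcond=None)[0]
            m[arm] = b[0] + X[te] @ b[1:]
        psi[te] = (m[1] - m[0]
                   + a[te] * (y[te] - m[1]) / e
                   - (1 - a[te]) * (y[te] - m[0]) / (1 - e))
    return psi.mean(), psi.std(ddof=1) / np.sqrt(n)

def simulate(n=2000, seed=1):
    # Hypothetical design: X0, X1 confounders; X2 outcome-only;
    # X3 treatment-only; X4, X5 noise. The true ATE is 2.
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 6))
    a = rng.binomial(1, sigmoid(0.5 * X[:, 0] + 0.5 * X[:, 1] + X[:, 3]))
    y = 2.0 * a + X[:, 0] + X[:, 1] + X[:, 2] + rng.normal(size=n)
    return X, a.astype(float), y

if __name__ == "__main__":
    X, a, y = simulate()
    ate, se = oa_dml_ate(X, a, y)
    print(f"ATE estimate: {ate:.3f} (SE {se:.3f})")
```

In this toy design, `X3` affects only the treatment, so its outcome coefficient is near zero and its penalty becomes very large, which is the mechanism by which such covariates are kept out of the propensity score model.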
Pages: 5880-5893
Number of pages: 14
Related papers
28 in total
[1]  
Athey S., 2018, EC ARTIFICIAL INTELL, P507, DOI 10.7208/CHICAGO/9780226613475.003.0021
[2]   Moving towards best practice when using inverse probability of treatment weighting (IPTW) using the propensity score to estimate causal treatment effects in observational studies [J].
Austin, Peter C. ;
Stuart, Elizabeth A. .
STATISTICS IN MEDICINE, 2015, 34 (28) :3661-3679
[3]   Doubly robust estimation in missing data and causal inference models [J].
Bang, H .
BIOMETRICS, 2005, 61 (04) :962-972
[4]  
Bhattacharya Jay., 2007, Do instrumental variables belong in propensity scores? : National Bureau of Economic Research Cambridge, DOI 10.3386/T0343
[5]   Variable selection for propensity score models [J].
Brookhart, M. Alan ;
Schneeweiss, Sebastian ;
Rothman, Kenneth J. ;
Glynn, Robert J. ;
Avorn, Jerry ;
Sturmer, Til .
AMERICAN JOURNAL OF EPIDEMIOLOGY, 2006, 163 (12) :1149-1156
[6]   Too many covariates and too few cases? - a comparative study [J].
Chen, Qingxia ;
Nian, Hui ;
Zhu, Yuwei ;
Talbot, H. Keipp ;
Griffin, Marie R. ;
Harrell, Frank E., Jr. .
STATISTICS IN MEDICINE, 2016, 35 (25) :4546-4558
[7]   Double/debiased machine learning for treatment and structural parameters [J].
Chernozhukov, Victor ;
Chetverikov, Denis ;
Demirer, Mert ;
Duflo, Esther ;
Hansen, Christian ;
Newey, Whitney ;
Robins, James .
ECONOMETRICS JOURNAL, 2018, 21 (01) :C1-C68
[8]   MACHINE LEARNING IN ECONOMETRICS Double/Debiased/Neyman Machine Learning of Treatment Effects [J].
Chernozhukov, Victor ;
Chetverikov, Denis ;
Demirer, Mert ;
Duflo, Esther ;
Hansen, Christian ;
Newey, Whitney .
AMERICAN ECONOMIC REVIEW, 2017, 107 (05) :261-265
[9]   The effectiveness of right heart catheterization in the initial care of critically ill patients [J].
Connors, AF ;
Speroff, T ;
Dawson, NV ;
Thomas, C ;
Harrell, FE ;
Wagner, D ;
Desbiens, N ;
Goldman, L ;
Wu, AW ;
Califf, RM ;
Fulkerson, WJ ;
Vidaillet, H ;
Broste, S ;
Bellamy, P ;
Lynn, J ;
Knaus, WA .
JAMA-JOURNAL OF THE AMERICAN MEDICAL ASSOCIATION, 1996, 276 (11) :889-897
[10]   Variable selection and forecasting via automated methods for linear models: LASSO/adaLASSO and Autometrics [J].
Epprecht, Camila ;
Guegan, Dominique ;
Veiga, Alvaro ;
da Rosa, Joel Correa .
COMMUNICATIONS IN STATISTICS-SIMULATION AND COMPUTATION, 2021, 50 (01) :103-122