Lasso adjustments of treatment effect estimates in randomized experiments

Cited by: 106
Authors
Bloniarz, Adam [1]
Liu, Hanzhong [1]
Zhang, Cun-Hui [2]
Sekhon, Jasjeet S. [1, 3]
Yu, Bin [1, 4]
Affiliations
[1] Univ Calif Berkeley, Dept Stat, Berkeley, CA 94720 USA
[2] Rutgers State Univ, Dept Stat & Biostat, Piscataway, NJ 08854 USA
[3] Univ Calif Berkeley, Dept Polit Sci, Berkeley, CA 94720 USA
[4] Univ Calif Berkeley, Dept Elect Engn & Comp Sci, Berkeley, CA 94720 USA
Funding
National Science Foundation (USA);
Keywords
randomized experiment; Neyman-Rubin model; average treatment effect; high-dimensional statistics; Lasso; REGRESSION ADJUSTMENTS; CARE;
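DOI
10.1073/pnas.1510506113
Chinese Library Classification number
O [Mathematical sciences and chemistry]; P [Astronomy and earth sciences]; Q [Biological sciences]; N [General natural sciences];
Discipline classification code
07; 0710; 09;
Abstract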
We provide a principled way for investigators to analyze randomized experiments when the number of covariates is large. Investigators often use linear multivariate regression to analyze randomized experiments instead of simply reporting the difference of means between treatment and control groups. Their aim is to reduce the variance of the estimated treatment effect by adjusting for covariates. If there are a large number of covariates relative to the number of observations, regression may perform poorly because of overfitting. In such cases, the least absolute shrinkage and selection operator (Lasso) may be helpful. We study the resulting Lasso-based treatment effect estimator under the Neyman-Rubin model of randomized experiments. We present theoretical conditions that guarantee that the estimator is more efficient than the simple difference-of-means estimator, and we provide a conservative estimator of the asymptotic variance, which can yield tighter confidence intervals than the difference-of-means estimator. Simulation and data examples show that Lasso-based adjustment can be advantageous even when the number of covariates is less than the number of observations. Specifically, a variant using Lasso for selection and ordinary least squares (OLS) for estimation performs particularly well, and it chooses a smoothing parameter based on combined performance of Lasso and OLS.
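The adjustment described in the abstract can be sketched in a few lines. The abstract does not state the estimator's formula, so the form used here is an assumption: a separate Lasso fit in each arm, with each arm's mean outcome corrected by the gap between that arm's covariate means and the full-sample covariate means. The function names (`lasso_cd`, `lasso_adjusted_ate`), the plain coordinate-descent solver, and the fixed penalty `lam` are all illustrative choices, not the authors' implementation; the Lasso+OLS variant and the tuning-parameter selection mentioned in the abstract are not shown.

```python
def soft_threshold(z, g):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def col_means(X):
    """Column means of a list-of-rows matrix."""
    n = len(X)
    return [sum(row[j] for row in X) / n for j in range(len(X[0]))]


def lasso_cd(X, y, lam, n_iter=200):
    """Slopes minimizing (1/2n)||y - b0 - X b||^2 + lam * ||b||_1.

    Plain cyclic coordinate descent on centered data (illustrative solver).
    """
    n, p = len(X), len(X[0])
    xbar = col_means(X)
    ybar = sum(y) / n
    Xc = [[row[j] - xbar[j] for j in range(p)] for row in X]
    resid = [yi - ybar for yi in y]  # residual when all slopes are zero
    beta = [0.0] * p
    col_ss = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue  # constant column carries no information
            # Correlation of column j with the partial residual.
            rho = sum(Xc[i][j] * resid[i] for i in range(n)) / n
            rho += col_ss[j] * beta[j]
            new = soft_threshold(rho, lam) / col_ss[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * Xc[i][j]
                beta[j] = new
    return beta


def lasso_adjusted_ate(X, y, treated, lam):
    """Difference of Lasso-adjusted arm means (assumed form, see lead-in)."""
    xbar_all = col_means(X)
    adjusted = {}
    for arm in (True, False):
        idx = [i for i in range(len(y)) if treated[i] == arm]
        Xa = [X[i] for i in idx]
        ya = [y[i] for i in idx]
        beta = lasso_cd(Xa, ya, lam)
        xbar_a = col_means(Xa)
        # Correct the arm mean for chance covariate imbalance.
        shift = sum((xbar_a[j] - xbar_all[j]) * beta[j] for j in range(len(beta)))
        adjusted[arm] = sum(ya) / len(ya) - shift
    return adjusted[True] - adjusted[False]
```

With a very large `lam`, every slope is shrunk to zero and the function returns the simple difference of means; with `lam = 0` and few covariates, it reduces to an OLS-style adjustment fitted separately in each arm.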
Pages: 7383-7390
Number of pages: 8