Penalized regression procedures for variable selection in the potential outcomes framework

Cited by: 28
Authors
Ghosh, Debashis [1]
Zhu, Yeying [2]
Coffman, Donna L. [3]
Affiliations
[1] Colorado Sch Publ Hlth, Dept Biostat & Informat, Aurora, CO 80045 USA
[2] Univ Waterloo, Dept Stat & Actuarial Sci, Waterloo, ON N2L 3G1, Canada
[3] Penn State Univ, Methodol Ctr, University Pk, PA 16802 USA
Funding
National Institutes of Health (USA)
Keywords
average causal effect; counterfactual; imputed data; L-1 penalty; treatment heterogeneity; MARGINAL STRUCTURAL MODELS; PROPENSITY SCORE; CAUSAL INFERENCE; LINEAR-MODELS; IMPUTATION
DOI
10.1002/sim.6433
Chinese Library Classification
Q [Biological Sciences]
Discipline classification codes
07; 0710; 09
Abstract
A recent topic of much interest in causal inference is model selection. In this article, we describe a framework in which to consider penalized regression approaches to variable selection for causal effects. The framework leads to a simple 'impute, then select' class of procedures that is agnostic to the type of imputation algorithm as well as penalized regression used. It also clarifies how model selection involves a multivariate regression model for causal inference problems and that these methods can be applied for identifying subgroups in which treatment effects are homogeneous. Analogies and links with the literature on machine learning methods, missing data, and imputation are drawn. A difference least absolute shrinkage and selection operator algorithm is defined, along with its multiple imputation analogs. The procedures are illustrated using a well-known right-heart catheterization dataset. Copyright (c) 2015 John Wiley & Sons, Ltd.
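As a rough illustration of the 'impute, then select' idea described in the abstract, the following Python sketch imputes each unit's unobserved potential outcome and then runs an L1-penalized regression of the imputed treatment-effect differences on the covariates. Everything in it is an assumption made for illustration and is not taken from the paper: the per-arm ordinary least squares imputation, the coordinate-descent lasso, the function names (`impute_potential_outcomes`, `lasso_cd`, `difference_lasso`), the penalty value, and the simulated data. The paper's own difference LASSO and its multiple imputation analogs may differ in detail.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - b0 - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, resid = X - x_mean, y - y_mean      # centering absorbs the intercept
    col_ss = (Xc ** 2).sum(axis=0) / n
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            if col_ss[j] == 0.0:
                continue
            resid = resid + Xc[:, j] * beta[j]   # partial residual without j
            rho = Xc[:, j] @ resid / n
            beta[j] = soft_threshold(rho, lam) / col_ss[j]
            resid = resid - Xc[:, j] * beta[j]
    return y_mean - x_mean @ beta, beta


def ols_predict(X_fit, y_fit, X_new):
    """Fit OLS with an intercept on (X_fit, y_fit) and predict at X_new."""
    A = np.column_stack([np.ones(len(X_fit)), X_fit])
    coef, *_ = np.linalg.lstsq(A, y_fit, rcond=None)
    return np.column_stack([np.ones(len(X_new)), X_new]) @ coef


def impute_potential_outcomes(X, t, y):
    """Impute step (assumed): fill each missing potential outcome via per-arm OLS."""
    y1 = np.where(t == 1, y, ols_predict(X[t == 1], y[t == 1], X))
    y0 = np.where(t == 0, y, ols_predict(X[t == 0], y[t == 0], X))
    return y0, y1


def difference_lasso(X, t, y, lam):
    """Select step (assumed): lasso of imputed differences Y(1) - Y(0) on X."""
    y0, y1 = impute_potential_outcomes(X, t, y)
    return lasso_cd(X, y1 - y0, lam)


# Simulated data (hypothetical): the treatment effect depends only on X[:, 0].
rng = np.random.default_rng(0)
n, p = 400, 6
X = rng.normal(size=(n, p))
t = rng.binomial(1, 0.5, size=n)
y = X[:, 1] + t * (1.0 + 2.0 * X[:, 0]) + rng.normal(scale=0.5, size=n)

intercept, beta = difference_lasso(X, t, y, lam=0.3)
selected = np.flatnonzero(beta)   # indices of covariates kept as effect modifiers
print("selected covariates:", selected)
print("coefficients:", np.round(beta, 3))
```

In this simulation the treatment effect varies only with the first covariate, so a nonzero coefficient on that covariate is what one would hope to see. Because the abstract describes the framework as agnostic to the imputation algorithm and the penalized regression, either component of the sketch could be swapped for another method.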
Pages: 1645-1658
Number of pages: 14