Regression adjustment for treatment effect with multicollinearity in high dimensions

Cited by: 11
Authors
Yue, Lili [1, 2]
Li, Gaorong [2]
Lian, Heng [3]
Wan, Xiang [4]
Affiliations
[1] Beijing Univ Technol, Coll Appl Sci, Beijing 100124, Peoples R China
[2] Beijing Univ Technol, Beijing Inst Sci & Engn Comp, Beijing 100124, Peoples R China
[3] City Univ Hong Kong, Dept Math, Hong Kong, Peoples R China
[4] Shenzhen Res Inst Big Data, Shenzhen 518172, Peoples R China
Funding
Beijing Natural Science Foundation; National Natural Science Foundation of China
Keywords
Average treatment effect; Causal inference; Elastic-net; High-dimensional data; Randomized experiments; Rubin causal model; Variable selection; Model selection; Shrinkage
DOI
10.1016/j.csda.2018.11.002
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Randomized experiments are an important tool for studying the Average Treatment Effect (ATE). This paper considers regression adjustment estimation of the Sample Average Treatment Effect (SATE) in the high-dimensional case, where multicollinearity is often encountered and needs to be handled properly. Many existing regression adjustment methods fail to achieve satisfactory performance in this setting. To address this issue, an Elastic-net adjusted estimator for the SATE is proposed under the Rubin causal model of randomized experiments with multicollinearity in high dimensions. The asymptotic properties of the proposed SATE estimator are established under some regularity conditions, and its asymptotic variance is proved to be no greater than that of the unadjusted estimator. Furthermore, Neyman-type conservative estimators of the asymptotic variance are proposed, which yield tighter confidence intervals than both the unadjusted and the Lasso-based adjusted estimators. Simulation studies show that the Elastic-net adjusted method handles collinearity better than existing methods. The advantages of the proposed method are also demonstrated in an analysis of a dataset of HER2 breast cancer patients. (C) 2018 Elsevier B.V. All rights reserved.
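As a rough illustration of the idea in the abstract, the sketch below fits an Elastic-net regression separately in each arm of a randomized experiment and uses the fitted coefficients to correct the difference in means for chance covariate imbalance. It is not the authors' implementation: the estimator follows the usual form of penalized regression adjustment with an Elastic-net penalty swapped in, and the function names (`elastic_net_cd`, `adjusted_sate`, `simulate`), the tuning values `lam` and `alpha`, and the simulated data are all assumptions made for this example.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used by the L1 part of the penalty."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def elastic_net_cd(X, y, lam=0.1, alpha=0.5, n_iter=200):
    """Coordinate descent for
        (1/2n)||y - Xb||^2 + lam * (alpha*||b||_1 + (1-alpha)/2 * ||b||_2^2).
    X and y are assumed to be centred already (no intercept is fitted)."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()  # equals y - X @ beta throughout
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual.
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta


def adjusted_sate(X, y, treat, lam=0.1, alpha=0.5):
    """Difference of arm means, each corrected by (arm covariate mean
    minus overall covariate mean) times that arm's Elastic-net slope."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    treat = np.asarray(treat, dtype=bool)
    x_bar = X.mean(axis=0)
    adjusted_means = []
    for arm in (True, False):
        Xa, ya = X[treat == arm], y[treat == arm]
        beta = elastic_net_cd(Xa - Xa.mean(axis=0), ya - ya.mean(), lam, alpha)
        adjusted_means.append(ya.mean() - (Xa.mean(axis=0) - x_bar) @ beta)
    return adjusted_means[0] - adjusted_means[1]


def simulate(n=200, p=50, tau=2.0, seed=0):
    """Toy completely randomized experiment with strongly correlated covariates."""
    rng = np.random.default_rng(seed)
    shared = rng.normal(size=(n, 1))
    X = 0.9 * shared + 0.45 * rng.normal(size=(n, p))  # pairwise correlation about 0.8
    treat = np.zeros(n, dtype=bool)
    treat[rng.choice(n, n // 2, replace=False)] = True
    y = X[:, :3].sum(axis=1) + rng.normal(size=n) + tau * treat
    return X, y, treat


if __name__ == "__main__":
    X, y, treat = simulate()
    print("unadjusted:", y[treat].mean() - y[~treat].mean())
    print("elastic-net adjusted:", adjusted_sate(X, y, treat))
```

With a very large `lam` every coefficient is shrunk to zero and the estimate reduces to the plain difference in means, which is a convenient sanity check. The ridge part of the penalty (`alpha < 1`) is what keeps the fit stable when covariates are highly correlated. In practice the tuning parameters would be chosen by cross-validation, not fixed as here.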
Pages: 17-35
Page count: 19
Related papers
31 in total
  • [1] [Anonymous], 1990, STAT SCI
  • [2] PROGRAM EVALUATION AND CAUSAL INFERENCE WITH HIGH-DIMENSIONAL DATA
    Belloni, A.
    Chernozhukov, V.
    Fernandez-Val, I.
    Hansen, C.
    [J]. ECONOMETRICA, 2017, 85 (01) : 233 - 298
  • [3] Inference on Treatment Effects after Selection among High-Dimensional Controls
    Belloni, Alexandre
    Chernozhukov, Victor
    Hansen, Christian
    [J]. REVIEW OF ECONOMIC STUDIES, 2014, 81 (02) : 608 - 650
  • [4] Lasso adjustments of treatment effect estimates in randomized experiments
    Bloniarz, Adam
    Liu, Hanzhong
    Zhang, Cun-Hui
    Sekhon, Jasjeet S.
    Yu, Bin
    [J]. PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA, 2016, 113 (27) : 7383 - 7390
  • [5] Simultaneous regression shrinkage, variable selection, and supervised clustering of predictors with OSCAR
    Bondell, Howard D.
    Reich, Brian J.
    [J]. BIOMETRICS, 2008, 64 (01) : 115 - 123
  • [6] Shrinkage and model selection with correlated variables via weighted fusion
    Daye, Z. John
    Jeng, X. Jessie
    [J]. COMPUTATIONAL STATISTICS & DATA ANALYSIS, 2009, 53 (04) : 1284 - 1298
  • [7] Dudoit S, 2009, J AM STAT ASSOC, V97, P77
  • [8] Gene Expression Omnibus: NCBI gene expression and hybridization array data repository
    Edgar, R
    Domrachev, M
    Lash, AE
    [J]. NUCLEIC ACIDS RESEARCH, 2002, 30 (01) : 207 - 210
  • [9] Sure independence screening for ultrahigh dimensional feature space
    Fan, Jianqing
    Lv, Jinchi
    [J]. JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2008, 70 : 849 - 883
  • [10] SURE INDEPENDENCE SCREENING IN GENERALIZED LINEAR MODELS WITH NP-DIMENSIONALITY
    Fan, Jianqing
    Song, Rui
    [J]. ANNALS OF STATISTICS, 2010, 38 (06) : 3567 - 3604