A comparison of full model specification and backward elimination of potential confounders when estimating marginal and conditional causal effects on binary outcomes from observational data

Cited by: 4
Authors
Luijken, Kim [1]
Groenwold, Rolf H. H. [1,2]
van Smeden, Maarten [1,3]
Strohmaier, Susanne [4,5]
Heinze, Georg [4]
Affiliations
[1] Leiden Univ, Dept Clin Epidemiol, Med Ctr, Leiden, Netherlands
[2] Leiden Univ, Dept Biomed Data Sci, Med Ctr, Leiden, Netherlands
[3] Univ Utrecht, Univ Med Ctr Utrecht, Julius Ctr Hlth Sci & Primary Care, Utrecht, Netherlands
[4] Med Univ Vienna, Ctr Med Stat Informat & Intelligent Syst, Sect Clin Biometr, Spitalgasse 23, A-1090 Vienna, Austria
[5] Med Univ Vienna, Ctr Publ Hlth, Dept Epidemiol, Vienna, Austria
Funding
European Union Horizon 2020; Austrian Science Fund;
Keywords
backward elimination; causal inference; confounder selection; RELATIVE RISKS; SELECTION; BIAS; INFERENCE; QUALITY;
DOI
10.1002/bimj.202100237
Chinese Library Classification number
Q [Biological Sciences];
Discipline classification code
07 ; 0710 ; 09 ;
Abstract
A common view in epidemiology is that automated confounder selection methods, such as backward elimination, should be avoided as they can lead to biased effect estimates and underestimation of their variance. Nevertheless, backward elimination remains regularly applied. We investigated if and under which conditions causal effect estimation in observational studies can improve by using backward elimination on a prespecified set of potential confounders. An expression was derived that quantifies how variable omission relates to bias and variance of effect estimators. Additionally, 3960 scenarios were defined and investigated by simulations comparing bias and mean squared error (MSE) of the conditional log odds ratio, log(cOR), and the marginal log risk ratio, log(mRR), between full models including all prespecified covariates and backward elimination of these covariates. Applying backward elimination resulted in a mean bias of 0.03 for log(cOR) and 0.02 for log(mRR), compared to 0.56 and 0.52 for log(cOR) and log(mRR), respectively, for a model without any covariate adjustment, and no bias for the full model. In less than 3% of the scenarios considered, the MSE of the log(cOR) or log(mRR) was slightly lower (max 3%) when backward elimination was used compared to the full model. When an initial set of potential confounders can be specified based on background knowledge, there is minimal added value of backward elimination. We advise not to use it and otherwise to provide ample arguments supporting its use.
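The comparison described in the abstract can be illustrated with a small, single-dataset sketch. This is not the authors' simulation design: the sample size, effect sizes, number of covariates, the Wald-test cutoff (`z_crit=1.96`), and all function names below are assumptions chosen for illustration only. It fits an unadjusted model, a full model, and a backward-elimination model, then reports the conditional log odds ratio and a standardized marginal log risk ratio for each.

```python
import numpy as np

def expit(x):
    return 1.0 / (1.0 + np.exp(-x))

def fit_logistic(X, y, n_iter=25):
    """Maximum-likelihood logistic regression via Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hessian, X.T @ (y - p))
    p = expit(X @ beta)
    cov = np.linalg.inv(X.T @ (X * (p * (1.0 - p))[:, None]))
    return beta, cov

def backward_eliminate(X, y, protected=(0, 1), z_crit=1.96):
    """Drop the covariate with the smallest |Wald z| until all exceed z_crit.

    Columns in `protected` (intercept and exposure) are never removed.
    The cutoff is an illustrative assumption, not the paper's setting.
    """
    kept = list(range(X.shape[1]))
    while True:
        beta, cov = fit_logistic(X[:, kept], y)
        z = np.abs(beta / np.sqrt(np.diag(cov)))
        candidates = [(z[i], col) for i, col in enumerate(kept)
                      if col not in protected]
        if not candidates:
            return kept, beta
        z_min, col = min(candidates)
        if z_min >= z_crit:
            return kept, beta
        kept.remove(col)

def log_mrr(X, beta, exposure_col=1):
    """Marginal log risk ratio by standardization over the observed covariates."""
    X1, X0 = X.copy(), X.copy()
    X1[:, exposure_col] = 1.0
    X0[:, exposure_col] = 0.0
    return np.log(expit(X1 @ beta).mean() / expit(X0 @ beta).mean())

# Hypothetical data-generating mechanism: Z1 and Z2 confound the
# exposure-outcome relation, Z3 and Z4 are pure noise.
rng = np.random.default_rng(2021)
n = 5000
Z = rng.normal(size=(n, 4))
A = rng.binomial(1, expit(0.8 * Z[:, 0] + 0.8 * Z[:, 1]))
Y = rng.binomial(1, expit(-1.0 + 0.7 * A + 0.8 * Z[:, 0] + 0.8 * Z[:, 1]))
X = np.column_stack([np.ones(n), A, Z])  # intercept, exposure, covariates

crude_beta, _ = fit_logistic(X[:, :2], Y)   # no covariate adjustment
full_beta, _ = fit_logistic(X, Y)           # all prespecified covariates
kept, be_beta = backward_eliminate(X, Y)    # backward elimination

for name, cols, beta in [("unadjusted", [0, 1], crude_beta),
                         ("full model", list(range(X.shape[1])), full_beta),
                         ("backward elimination", kept, be_beta)]:
    print(f"{name}: log(cOR) = {beta[1]:.3f}, "
          f"log(mRR) = {log_mrr(X[:, cols], beta):.3f}")
```

Repeating this over many simulated datasets and scenarios, and summarizing bias and MSE of each estimator, would be the natural extension toward the kind of comparison the abstract reports.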
Pages: 14