Variable selection in Propensity Score Adjustment to mitigate selection bias in online surveys

Authors
Ramón Ferri-García
María del Mar Rueda
Affiliation
[1] University of Granada,Department of Statistics and Operations Research
Source
Statistical Papers | 2022 / Vol. 63
Keywords
Online surveys; Propensity Score Adjustment; Selection bias; Variable selection; Raking calibration
DOI
Not available
Abstract
The development of new survey data collection methods such as online surveys has been particularly advantageous for social studies in terms of reduced costs, immediacy and enhanced questionnaire possibilities. However, many such methods are strongly affected by selection bias, leading to unreliable estimates. Calibration and Propensity Score Adjustment (PSA) have been proposed as methods to remove selection bias in online nonprobability surveys. Calibration requires population totals to be known for the auxiliary variables used in the procedure, while PSA estimates the volunteering propensity of an individual using predictive modelling. The variables included in these models must be carefully selected in order to maximise the accuracy of the final estimates. This study presents an application, using synthetic and real data, of variable selection techniques developed for knowledge discovery in data to choose the best subset of variables for propensity estimation. We also compare the performance of PSA using different classification algorithms, after which calibration is applied. We also present an application of this methodology in a real-world situation, using it to obtain estimates of population parameters. The results obtained show that variable selection using appropriate methods can provide less biased and more efficient estimates than using all available covariates.
Pages: 1829-1881
Page count: 52