Predicting Web Survey Breakoffs Using Machine Learning Models

Authors
Chen, Zeming [1]
Cernat, Alexandru [2]
Shlomo, Natalie [2]
Affiliations
[1] Univ Manchester, Social Stat Dept, Manchester, Lancs, England
[2] Univ Manchester, Social Stat Dept, Social Stat, Manchester, Lancs, England
Keywords
breakoff timing; time-varying variables; Cox model; LASSO Cox model; logistic regression; random forest; gradient boosting; support vector machine; RATES; TREE;
DOI
10.1177/08944393221112000
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Web surveys are becoming increasingly popular but tend to have more breakoffs than interviewer-administered surveys. Survey breakoffs occur when respondents quit the survey partway through. The Cox survival model is commonly used to understand patterns of breakoffs. Nevertheless, there is a trend towards using more data-driven models, such as classification machine learning models, when the purpose is prediction. It is unclear in the breakoff literature which statistical models are best for predicting question-level breakoffs. Additionally, there is no consensus about the treatment of time-varying question-level predictors, such as question response time and question word count. While some researchers use the current values, others aggregate the values from the beginning of the survey. This study develops and compares both survival models and classification models along with different treatments of time-varying variables. Based on the level of agreement between the predicted and actual breakoff, we find that the Cox model and gradient boosting outperform the other survival models and classification models, respectively. We also find that using the values of time-varying predictors concurrent with the breakoff status is more predictive of breakoff than aggregating their values from the beginning of the survey, implying that respondents' breakoff behaviour is driven more by the current response burden.
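The two treatments of time-varying predictors contrasted in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `build_records`, the column names, the one-row-per-question layout, and the choice of a running sum as the aggregation are all assumptions made for illustration (a running mean would be an equally plausible reading of "aggregate").

```python
from itertools import accumulate


def build_records(response_times, word_counts, broke_off):
    """Build one row per question seen by a single respondent.

    Hypothetical layout: each row carries both the current value of a
    time-varying predictor and its running total since the survey began.
    """
    if len(response_times) != len(word_counts):
        raise ValueError("inputs must have one entry per question")

    # Assumption: "aggregating from the beginning" is taken as a running sum.
    cum_time = list(accumulate(response_times))
    cum_words = list(accumulate(word_counts))
    last = len(response_times) - 1

    rows = []
    for q, (t, w) in enumerate(zip(response_times, word_counts)):
        rows.append({
            "question": q + 1,
            "time_current": t,            # treatment 1: current value
            "time_cumulative": cum_time[q],  # treatment 2: aggregated value
            "words_current": w,
            "words_cumulative": cum_words[q],
            # Breakoff is flagged only on the last question the respondent saw.
            "breakoff": int(broke_off and q == last),
        })
    return rows


# Illustrative (made-up) respondent who quits at the third question.
rows = build_records([12.0, 8.5, 30.0], [15, 9, 42], broke_off=True)
for row in rows:
    print(row)
```

A survival or classification model would then be fitted on either the `*_current` or the `*_cumulative` columns, so that the two treatments can be compared on the same respondents.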
Pages: 573-591
Page count: 19