Identifying and preventing fraudulent responses in online public health surveys: Lessons learned during the COVID-19 pandemic

Cited by: 34
Authors
Wang, June [1]
Calderon, Gabriela [2]
Hager, Erin R. [3]
Edwards, Lorece V. [4]
Berry, Andrea A. [5]
Liu, Yisi [2]
Dinh, Janny [3]
Summers, August C. [6]
Connor, Katherine A. [2]
Collins, Megan E. [7]
Prichett, Laura [2]
Marshall, Beth R. [3]
Johnson, Sara B. [2, 3]
Affiliations
[1] Johns Hopkins Univ, Krieger Sch Arts & Sci, Baltimore, MD USA
[2] Johns Hopkins Sch Med, Dept Pediat, Baltimore, MD 21205 USA
[3] Johns Hopkins Bloomberg Sch Publ Hlth, Dept Populat Family & Reprod Hlth, Baltimore, MD 21205 USA
[4] Morgan State Univ, Sch Community Hlth & Policy, Baltimore, MD USA
[5] Univ Maryland, Sch Med, Dept Pediat, Baltimore, MD USA
[6] Johns Hopkins Ctr Commun Programs, Baltimore, MD USA
[7] Johns Hopkins Wilmer Eye Inst, Baltimore, MD USA
Source
PLOS GLOBAL PUBLIC HEALTH | 2023 / Vol. 3 / Issue 08
DOI
10.1371/journal.pgph.0001452
Chinese Library Classification (CLC) number
R1 [Preventive medicine, hygiene]
Subject classification code
1004; 120402
Abstract
Web-based survey data collection has become increasingly popular, and limitations on in-person data collection during the COVID-19 pandemic have fueled this growth. However, the anonymity of the online environment increases the risk of fraudulent responses provided by bots or those who complete surveys to receive incentives, a major risk to data integrity. As part of a study of COVID-19 and the return to in-person school, we implemented a web-based survey of parents in Maryland between December 2021 and July 2022. Recruitment relied, in part, on social media advertisements. Despite implementing many existing best practices, we found the survey challenged by sophisticated fraudsters. In response, we iteratively improved survey security. In this paper, we describe efforts to identify and prevent fraudulent online survey responses. Informed by this experience, we provide specific, actionable recommendations for identifying and preventing online survey fraud in future research. Some strategies can be deployed within the data collection platform, such as careful crafting of survey links, Internet Protocol address logging to identify duplicate responses, and comparison of client-side and server-side time stamps to identify responses that may have been completed by respondents outside of the survey's target geography. Other strategies can be implemented during the survey design phase. These approaches include the use of a 2-stage design in which respondents must be eligible on a preliminary screener before receiving a personalized link. Other design-based strategies include within-survey and cross-survey validation questions, the addition of "speed bump" questions to thwart careless or computerized responders, and the use of optional open-ended survey questions to identify fraudsters.
We describe best practices for ongoing monitoring and post-completion survey data review and verification, including algorithms to expedite some aspects of data review and quality assurance. Such strategies are increasingly critical to safeguarding survey-based public health research.
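Two of the platform-level checks named in the abstract, duplicate IP detection and the comparison of client-side and server-side time stamps, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the record fields (`id`, `ip`, `client_local`, `server_utc`), the function names, and the quarter-hour rounding are all assumptions made for illustration. The only outside fact used is that Maryland observes US Eastern time, which is UTC-5 in winter and UTC-4 during daylight saving time.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical record layout (an assumption, not taken from the paper):
#   id           - response identifier
#   ip           - IP address logged by the survey platform
#   client_local - naive wall-clock time reported by the respondent's browser
#   server_utc   - timezone-aware time recorded by the server

def flag_duplicate_ips(responses):
    """Return the IDs of responses whose IP address appears more than once."""
    counts = Counter(r["ip"] for r in responses)
    return {r["id"] for r in responses if counts[r["ip"]] > 1}

def implied_utc_offset_hours(client_local, server_utc):
    """Estimate the respondent's UTC offset from the two time stamps.

    The difference is rounded to the nearest quarter hour so that small
    clock drift and network delay do not affect the estimate.
    """
    server_naive = server_utc.astimezone(timezone.utc).replace(tzinfo=None)
    delta_seconds = (client_local - server_naive).total_seconds()
    return round(delta_seconds / 900) / 4

def flag_offset_mismatch(responses, expected_offsets=(-5.0, -4.0)):
    """Return the IDs of responses whose implied UTC offset is unexpected.

    The default offsets correspond to US Eastern time (EST and EDT).
    """
    return {
        r["id"]
        for r in responses
        if implied_utc_offset_hours(r["client_local"], r["server_utc"])
        not in expected_offsets
    }
```

Both functions only flag responses for manual review. A shared IP address can be legitimate (for example, two parents in one household), and a device clock can be set to the wrong time zone, so neither flag is proof of fraud on its own.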
Pages: 10
Related papers
15 items in total
[1]   Social media as a recruitment platform for a nationwide online survey of COVID-19 knowledge, beliefs, and practices in the United States: methodology and feasibility analysis [J].
Ali, Shahmir H. ;
Foreman, Joshua ;
Capasso, Ariadna ;
Jones, Abbey M. ;
Tozan, Yesim ;
DiClemente, Ralph J. .
BMC MEDICAL RESEARCH METHODOLOGY, 2020, 20 (01)
[2]   Fraud Detection Protocol for Web-Based Research Among Men Who Have Sex With Men: Development and Descriptive Evaluation [J].
Ballard, April M. ;
Cardwell, Trey ;
Young, April M. .
JMIR PUBLIC HEALTH AND SURVEILLANCE, 2019, 5 (01) :80-89
[3]  
Das M, 2018, Social and Behavioral Research and the Internet: Advances in Applied Methods and Research Strategies
[4]   Out damn bot, out: Recruiting real people into substance use studies on the internet [J].
Godinho, Alexandra ;
Schell, Christina ;
Cunningham, John A. .
SUBSTANCE ABUSE, 2020, 41 (01) :3-5
[5]  
Göritz AS, 2006, INT J INTERNET SCI, V1, P58
[6]   Ensuring survey research data integrity in the era of internet bots [J].
Griffin M. ;
Martino R.J. ;
LoSchiavo C. ;
Comer-Carruthers C. ;
Krause K.D. ;
Stults C.B. ;
Halkitis P.N. .
Quality & Quantity, 2022, 56 (4) :2841-2852
[7]   Research electronic data capture (REDCap)-A metadata-driven methodology and workflow process for providing translational research informatics support [J].
Harris, Paul A. ;
Taylor, Robert ;
Thielke, Robert ;
Payne, Jonathon ;
Gonzalez, Nathaniel ;
Conde, Jose G. .
JOURNAL OF BIOMEDICAL INFORMATICS, 2009, 42 (02) :377-381
[8]   Digitizing clinical trials [J].
Inan, O. T. ;
Tenaerts, P. ;
Prindiville, S. A. ;
Reynolds, H. R. ;
Dizon, D. S. ;
Cooper-Arnold, K. ;
Turakhia, M. ;
Pletcher, M. J. ;
Preston, K. L. ;
Krumholz, H. M. ;
Marlin, B. M. ;
Mandl, K. D. ;
Klasnja, P. ;
Spring, B. ;
Iturriaga, E. ;
Campo, R. ;
Desvigne-Nickens, P. ;
Rosenberg, Y. ;
Steinhubl, S. R. ;
Califf, R. M. .
NPJ DIGITAL MEDICINE, 2020, 3 (01)
[9]  
Kayrouz R, 2016, Facebook as an effective recruitment strategy for mental health research of hard to reach populations, V4, DOI 10.1016/j.invent.2016.01.001
[10]  
Lenneville C, GitLab Internet