Categorizing Vaccine Confidence With a Transformer-Based Machine Learning Model: Analysis of Nuances of Vaccine Sentiment in Twitter Discourse

Cited by: 15
Authors
Kummervold, Per E. [1]
Martin, Sam [2, 3, 4, 5]
Dada, Sara [4, 6]
Kilich, Eliz [4]
Denny, Chermain [7]
Paterson, Pauline [4, 8]
Larson, Heidi J. [4, 8, 9, 10]
Affiliations
[1] FISABIO Publ Hlth, Vaccine Res Dept, Avda Catalunya 21, Valencia 46020, Spain
[2] Univ Oxford, Ctr Clin Vaccinol & Trop Med, Oxford, England
[3] UCL, Dept Targeted Intervent, Rapid Res Evaluat & Appraisal Lab, London, England
[4] London Sch Hyg & Trop Med, Fac Epidemiol & Populat Hlth, London, England
[5] Univ Oxford, Ethox Ctr, Big Data Inst, Nuffield Dept Populat Hlth, Oxford, England
[6] Univ Coll Dublin, UCD Ctr Interdisciplinary Res Educ & Innovat Hlth, Sch Nursing Midwifery & Hlth Syst, Dublin, Ireland
[7] Vrije Univ Amsterdam, Fac Sci, Amsterdam, Netherlands
[8] NIHR Hlth Protect Res Unit, London, England
[9] Univ Washington, Inst Hlth Metr & Evaluat, Seattle, WA 98195 USA
[10] Royal Inst Int Affairs, Chatham House Ctr Global Hlth Secur, London, England
Keywords
computer science; information technology; public health; health humanities; vaccines; machine learning
DOI
10.2196/29584
Chinese Library Classification (CLC) number
R-058
Abstract
Background: Social media has become an established platform for individuals to discuss and debate various subjects, including vaccination. With growing conversations on the web and lower-than-desired maternal vaccination uptake rates, these conversations could provide useful insights to inform future interventions. However, owing to the volume of web-based posts, manual annotation and analysis are difficult and time-consuming. Automated processes for this type of analysis, such as natural language processing, have faced challenges in extracting complex stances, such as attitudes toward vaccination, from large amounts of text.
Objective: The aim of this study is to build upon recent advances in transformer-based machine learning methods and test whether transformer-based machine learning could be used as a tool to assess the stance expressed in social media posts toward vaccination during pregnancy.
Methods: A total of 16,604 tweets posted between November 1, 2018, and April 30, 2019, were selected using keyword searches related to maternal vaccination. After excluding irrelevant tweets, the remaining tweets were coded by 3 individual researchers into the categories Promotional, Discouraging, Ambiguous, and Neutral or No Stance. After creating a final data set of 2722 unique tweets, multiple machine learning techniques were trained on a part of this data set and then tested and compared with the human annotators.
Results: We found the accuracy of the machine learning techniques to be 81.8% (F score=0.78) compared with the agreed score among the 3 annotators. For comparison, the accuracies of the individual annotators compared with the final score were 83.3%, 77.9%, and 77.5%.
Conclusions: This study demonstrates that we are able to achieve close to the same accuracy in categorizing tweets using our machine learning models as could be expected from a single human coder. The potential to use this automated process, which is reliable and accurate, could free valuable time and resources for conducting this analysis, in addition to informing potentially effective and necessary interventions.
(JMIR Med Inform 2021;9(10):e29584) doi: 10.2196/29584
Pages: 10