Public Awareness and Sentiment Analysis of COVID-Related Discussions Using BERT-Based Infoveillance

Times cited: 6
Authors
Xie, Tianyi [1]
Ge, Yaorong [1]
Xu, Qian [2]
Chen, Shi [3]
Affiliations
[1] Univ N Carolina, Dept Software & Informat Syst, Charlotte, NC 28223 USA
[2] Elon Univ, Sch Commun, Elon, NC 27244 USA
[3] Univ N Carolina, Dept Publ Hlth Sci, Charlotte, NC 28223 USA
Keywords
public awareness; sentiment analysis; social media analytics; infoveillance; natural language processing; social media; influenza; Ebola; web
DOI
10.3390/ai4010016
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Understanding different aspects of public concerns and sentiments during large health emergencies, such as the COVID-19 pandemic, is essential for public health agencies to develop effective communication strategies, deliver up-to-date and accurate health information, and mitigate the potential impacts of emerging misinformation. Current infoveillance systems generally focus on discussion intensity (i.e., the number of relevant posts) as an approximation of public awareness, while largely ignoring the rich and diverse information in the texts themselves, which carry granular detail about varying public concerns and sentiments. In this study, we address this challenge by developing a novel natural language processing (NLP) infoveillance workflow based on Bidirectional Encoder Representations from Transformers (BERT). We first used a smaller COVID-19 tweet sample to develop content classification and sentiment analysis models using COVID-Twitter-BERT. The classification accuracy was between 0.77 and 0.88 across the five identified topics. In the sentiment analysis, a three-class classification task (positive/negative/neutral), BERT achieved an accuracy of 0.7. We then applied the content topic and sentiment classifiers to a much larger dataset of more than 4 million tweets covering a 15-month period. We specifically analyzed the non-pharmaceutical intervention (NPI) and social issue content topics. There were significant differences in public awareness and sentiment towards the overall COVID-19, NPI, and social issue content topics across time and space. In addition, key events were identified that were associated with abrupt sentiment changes towards NPIs and social issues. This novel NLP-based AI workflow can be readily adopted for real-time granular content topic and sentiment infoveillance beyond the health context.
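The contrast the abstract draws between discussion intensity and sentiment can be illustrated with a small aggregation sketch. It assumes each tweet has already been assigned a month, a content topic, and a sentiment label by some classifier. The function name `summarize`, the sample records, and the mapping of sentiment labels to -1/0/1 are all hypothetical choices made for this illustration; they are not taken from the paper.

```python
from collections import defaultdict

# Assumed numeric coding of the three sentiment classes (illustrative only).
SENTIMENT_SCORE = {"negative": -1, "neutral": 0, "positive": 1}


def summarize(predictions):
    """Group labelled tweets by (month, topic).

    Returns, for each group, the tweet volume (discussion intensity)
    and the mean sentiment score.
    """
    buckets = defaultdict(list)
    for month, topic, sentiment in predictions:
        buckets[(month, topic)].append(SENTIMENT_SCORE[sentiment])
    return {
        key: {"volume": len(scores), "mean_sentiment": sum(scores) / len(scores)}
        for key, scores in buckets.items()
    }


# Made-up classifier output: (month, content topic, sentiment label).
predictions = [
    ("2020-03", "NPI", "negative"),
    ("2020-03", "NPI", "neutral"),
    ("2020-03", "social issue", "negative"),
    ("2020-04", "NPI", "positive"),
    ("2020-04", "NPI", "positive"),
]

summary = summarize(predictions)
for (month, topic), stats in sorted(summary.items()):
    print(month, topic, stats["volume"], round(stats["mean_sentiment"], 2))
```

In this toy data the NPI topic has the same volume in both months, yet its mean sentiment differs, which is the kind of signal a volume-only count would miss.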
Pages: 333-347
Page count: 15