Infoveillance of the Croatian Online Media During the COVID-19 Pandemic: One-Year Longitudinal Study Using Natural Language Processing

Cited by: 11
Authors
Beliga, Slobodan [1 ,2 ]
Martincic-Ipsic, Sanda [1 ,2 ]
Matesic, Mihaela [2 ,3 ]
Vuksanovic, Irena Petrijevcanin [2 ]
Mestrovic, Ana [1 ,2 ]
Affiliations
[1] Univ Rijeka, Dept Informat, Radmile Matejcic 2, Rijeka 51000, Croatia
[2] Univ Rijeka, Ctr Artificial Intelligence & Cybersecur, Rijeka, Croatia
[3] Univ Rijeka, Fac Humanities & Social Sci, Rijeka, Croatia
Keywords
COVID-19; pandemic; online media; news coverage; infoveillance; infodemic; infodemiology; natural language processing; named entity recognition; longitudinal study
DOI
10.2196/31540
Chinese Library Classification number
R1 [Preventive Medicine, Hygiene]
Discipline classification code
1004; 120402
Abstract
Background: Online media play an important role in public health emergencies and serve as essential communication platforms. Infoveillance of online media during the COVID-19 pandemic is an important step toward gaining a better understanding of crisis communication.
Objective: The goal of this study was to perform a longitudinal analysis of the COVID-19-related content on online media based on natural language processing.
Methods: We collected a data set of news articles published by Croatian online media during the first 13 months of the pandemic. First, we tested the correlations between the number of articles and the number of new daily COVID-19 cases. Second, we analyzed the content by extracting the most frequent terms and applied the Jaccard similarity coefficient. Third, we compared the occurrence of the pandemic-related terms during the two waves of the pandemic. Finally, we applied named entity recognition to extract the most frequent entities and tracked the dynamics of changes during the observation period.
Results: The results showed no significant correlation between the number of articles and the number of new daily COVID-19 cases. Furthermore, there were high overlaps in the terminology used in all articles published during the pandemic with a slight shift in the pandemic-related terms between the first and the second waves. Finally, the findings indicate that the most influential entities have lower overlaps for the identified people and higher overlaps for locations and institutions.
Conclusions: Our study shows that online media have a prompt response to the pandemic with a large number of COVID-19-related articles. There was a high overlap in the frequently used terms across the first 13 months, which may indicate the narrow focus of reporting in certain periods. However, the pandemic-related terminology is well-covered.
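The term-overlap step named in the Methods (most frequent terms compared with the Jaccard similarity coefficient) can be illustrated with a minimal sketch. It is not the authors' pipeline: the helper names `top_terms` and `jaccard`, the whitespace tokenization, and the toy headlines are all assumptions made for illustration, and the study's actual preprocessing of Croatian text is not described in this record.

```python
from collections import Counter


def top_terms(docs, k):
    """Return the set of the k most frequent lowercase tokens across docs.

    Assumption: plain whitespace tokenization; a real study would likely
    lemmatize and remove stop words first.
    """
    counts = Counter(tok for doc in docs for tok in doc.lower().split())
    return {term for term, _ in counts.most_common(k)}


def jaccard(a, b):
    """Jaccard similarity |A & B| / |A | B| of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical toy data standing in for articles from two periods.
wave1 = ["virus lockdown virus", "lockdown mask"]
wave2 = ["virus vaccine vaccine", "vaccine virus mask"]

overlap = jaccard(top_terms(wave1, 2), top_terms(wave2, 2))
print(f"Jaccard overlap of top-2 terms: {overlap:.3f}")
```

A value near 1 means the two periods share most of their frequent vocabulary, and a value near 0 means the vocabulary shifted. The cutoff `k` is a free parameter here and would need to be chosen to match the analysis being reproduced.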
Pages: 15