Machine Learning-Based Identifications of COVID-19 Fake News Using Biomedical Information Extraction

Cited by: 4
Authors
Fifita, Faizi [1]
Smith, Jordan [2]
Hanzsek-Brill, Melissa B. [2]
Li, Xiaoyin [2]
Zhou, Mengshi [2]
Affiliations
[1] St Cloud State Univ, Dept Comp Sci & Informat Technol, 720 4th Ave South, St Cloud, MN 56301 USA
[2] St Cloud State Univ, Dept Math & Stat, 720 4th Ave South, St Cloud, MN 56301 USA
Funding
National Science Foundation (USA);
Keywords
COVID-19; fake news; public health infodemic; machine learning; biomedical information extraction; SYSTEM; UMLS;
DOI
10.3390/bdcc7010046
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
The spread of fake news related to COVID-19 is an infodemic that leads to a public health crisis. Detecting fake news is therefore crucial for effective management of the COVID-19 pandemic response. Studies have shown that machine learning models can detect COVID-19 fake news based on the content of news articles. However, the use of biomedical information, which is often featured in COVID-19 news, has not been explored in the development of these models. We present a novel approach for predicting COVID-19 fake news by leveraging biomedical information extraction (BioIE) in combination with machine learning models. We analyzed 1164 COVID-19 news articles and used advanced BioIE algorithms to extract 158 novel features. These features were then used to train 15 machine learning classifiers to predict COVID-19 fake news. Among the 15 classifiers, the random forest model achieved the best performance, with an area under the ROC curve (AUC) of 0.882, which is 12.36% to 31.05% higher than that of models trained on traditional features. Furthermore, incorporating BioIE-based features improved the performance of a state-of-the-art multi-modality model (AUC 0.914 vs. 0.887). Our study suggests that incorporating biomedical information into fake news detection models improves their performance and could therefore be a valuable tool in the fight against the COVID-19 infodemic.
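The AUC figures quoted in the abstract can be read as the probability that a classifier scores a randomly chosen fake article above a randomly chosen real one. The sketch below is a minimal illustration of that metric in plain Python, not the authors' evaluation code. The function names `roc_auc` and `relative_gain` and the toy labels and scores are hypothetical. The relative-gain calculation assumes that percentage comparisons are relative differences, which the abstract does not state explicitly.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney U formulation).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def relative_gain(new: float, old: float) -> float:
    """Relative improvement of `new` over `old`, in percent."""
    return (new - old) / old * 100.0


# Hypothetical toy data (1 = fake, 0 = real); not taken from the paper.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)

# AUC values reported in the abstract for the multi-modality model
# with and without BioIE-based features.
gain = relative_gain(0.914, 0.887)
```

The pairwise loop is quadratic in the number of articles, which is acceptable for a dataset of about a thousand items; a rank-based implementation would be preferable at larger scale.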
Pages: 18
Related papers
78 in total
  • [1] CoAID-DEEP: An Optimized Intelligent Framework for Automated Detecting COVID-19 Misleading Information on Twitter
    Abdelminaam, Diaa Salama
    Ismail, Fatma Helmy
    Taha, Mohamed
    Taha, Ahmed
    Houssein, Essam H.
    Nabil, Ayman
    [J]. IEEE ACCESS, 2021, 9 : 27840 - 27867
  • [2] Language-Independent Fake News Detection: English, Portuguese, and Spanish Mutual Features
    Abonizio, Hugo Queiroz
    de Morais, Janaina Ignacio
    Tavares, Gabriel Marques
    Barbon Junior, Sylvio
    [J]. FUTURE INTERNET, 2020, 12 (05)
  • [3] The Impact of Social Media on Panic During the COVID-19 Pandemic in Iraqi Kurdistan: Online Questionnaire Study
    Ahmad, Araz Ramazan
    Murad, Hersh Rasool
    [J]. JOURNAL OF MEDICAL INTERNET RESEARCH, 2020, 22 (05)
  • [4] Lies Kill, Facts Save: Detecting COVID-19 Misinformation in Twitter
    Al-Rakhami, Mabrook S.
    Al-Amri, Atif M.
    [J]. IEEE ACCESS, 2020, 8 : 155961 - 155970
  • [5] Machine Learning in Detecting COVID-19 Misinformation on Twitter
    Alenezi, Mohammed N.
    Alqenaei, Zainab M.
    [J]. FUTURE INTERNET, 2021, 13 (10)
  • [6] Sentiment Analysis for Fake News Detection
    Alonso, Miguel A.
    Vilares, David
    Gomez-Rodriguez, Carlos
    Vilares, Jesus
    [J]. ELECTRONICS, 2021, 10 (11)
  • [7] An overview of MetaMap: historical perspective and recent advances
    Aronson, Alan R.
    Lang, Francois-Michel
    [J]. JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION, 2010, 17 (03) : 229 - 236
  • [8] Aronson AR, 2001, J AM MED INFORM ASSN, P17
  • [9] Baarir N.F., 2021, P IEEE INT WORKSH HU, P125, DOI 10.1109/IHSH51661.2021.9378748
  • [10] Concept annotation in the CRAFT corpus
    Bada, Michael
    Eckert, Miriam
    Evans, Donald
    Garcia, Kristin
    Shipley, Krista
    Sitnikov, Dmitry
    Baumgartner, William A., Jr.
    Cohen, K. Bretonnel
    Verspoor, Karin
    Blake, Judith A.
    Hunter, Lawrence E.
    [J]. BMC BIOINFORMATICS, 2012, 13