Detecting science-based health disinformation: a stylometric machine learning approach

Cited by: 1
Authors
Williams, Jason A. [1]
Aleroud, Ahmed [1]
Zimmerman, Danielle [1]
Affiliations
[1] Augusta Univ, Sch Comp & Cyber Sci, Augusta, GA 30192 USA
Source
JOURNAL OF COMPUTATIONAL SOCIAL SCIENCE | 2023, Vol. 6, No. 2
Keywords
Health disinformation; COVID-19; Machine learning; Science; Human behavior; Misinformation; Readability
DOI
10.1007/s42001-023-00213-y
Chinese Library Classification
O1 [Mathematics]; C [Social Sciences, General]
Discipline classification codes
03; 0303; 0701; 070101
Abstract
The COVID-19 pandemic showed that misleading scientific health information has become widespread and is challenging to counteract. Some of this disinformation comes from modification of medical research results. This paper investigates how humans create health disinformation through controlled changes of text from abstracts of peer-reviewed COVID-19 research papers. We also developed a machine learning model that used statement embeddings, readability, and text quality features to create datasets that contain falsified scientific statements. We then created machine learning classification models to identify statements containing disinformation. Our results reveal the importance of readability metrics and information quality features in identifying which statements were falsified. We show that text embeddings and semantic similarity do not yield a high detection rate of true/falsified statements compared to using information quality and readability features.
Pages: 817-843
Page count: 27