Entity and relation extraction from clinical case reports of COVID-19: a natural language processing approach

Cited by: 7
Authors
Raza, Shaina [1, 2]
Schwartz, Brian [1, 2]
Affiliations
[1] Publ Hlth Ontario PHO, Toronto, ON, Canada
[2] Univ Toronto, Dalla Lana Sch Publ Hlth, Toronto, ON, Canada
Keywords
Natural language processing; Data cohort; COVID-19; Named entity; Relation extraction; Transfer learning; Artificial intelligence; RECOGNITION;
DOI
10.1186/s12911-023-02117-3
Chinese Library Classification number
R-058
Abstract
Background: Extracting relevant information about infectious diseases is an essential task. However, a significant obstacle in supporting public health research is the lack of methods for effectively mining large amounts of health data.
Objective: This study aims to use natural language processing (NLP) to extract key information (clinical factors, social determinants of health) from published cases in the literature.
Methods: The proposed framework integrates a data layer for preparing a data cohort from clinical case reports; an NLP layer to find the clinical and demographic named entities and relations in the texts; and an evaluation layer for benchmarking performance and analysis. The focus of this study is to extract valuable information from COVID-19 case reports.
Results: The named entity recognition implementation in the NLP layer achieves a performance gain of about 1-3% compared to benchmark methods. Furthermore, even without extensive data labeling, the relation extraction method outperforms benchmark methods in accuracy by 1-8%. A thorough examination reveals the disease's presence and symptom prevalence in patients.
Conclusions: A similar approach can be generalized to other infectious diseases. It is worthwhile to use prior knowledge acquired through transfer learning when researching other infectious diseases.
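The three-layer structure described in the Methods can be illustrated with a minimal Python sketch. This is not the authors' implementation: the lexicon lookup stands in for the learned NER models, relation extraction is omitted, and every name here (LEXICON, data_layer, nlp_layer, evaluation_layer, the sample report and gold labels) is a hypothetical chosen for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical toy lexicon; the paper's NLP layer uses learned models,
# not a lookup table. This is only a stand-in to show the data flow.
LEXICON = {
    "fever": "SYMPTOM",
    "cough": "SYMPTOM",
    "covid-19": "DISEASE",
}


@dataclass(frozen=True)
class Entity:
    text: str
    label: str
    start: int
    end: int


def data_layer(raw_reports):
    """Data layer: normalize whitespace and drop empty reports."""
    cleaned = (" ".join(r.split()) for r in raw_reports)
    return [r for r in cleaned if r]


def nlp_layer(report):
    """NLP layer (stand-in): tag lexicon terms as named entities."""
    entities = []
    for term, label in LEXICON.items():
        # Match the term only when it is not embedded in a longer word.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        for m in re.finditer(pattern, report, flags=re.IGNORECASE):
            entities.append(Entity(m.group(0), label, m.start(), m.end()))
    return sorted(entities, key=lambda e: e.start)


def evaluation_layer(predicted, gold):
    """Evaluation layer: exact-match precision, recall, F1 on (text, label) pairs."""
    pred = {(e.text.lower(), e.label) for e in predicted}
    gold = set(gold)
    tp = len(pred & gold)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Invented sample report and gold annotations, for illustration only.
reports = data_layer([
    "A 54-year-old man with  COVID-19 presented with fever and dry cough.",
    "   ",
])
entities = nlp_layer(reports[0])
gold = [
    ("covid-19", "DISEASE"),
    ("fever", "SYMPTOM"),
    ("cough", "SYMPTOM"),
    ("dyspnea", "SYMPTOM"),
]
precision, recall, f1 = evaluation_layer(entities, gold)
```

Keeping the layers as separate functions means the lookup in nlp_layer could be swapped for a fine-tuned model without touching the cohort preparation or the scoring code.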
Pages: 17