Event-Based Clinical Finding Extraction from Radiology Reports with Pre-trained Language Model

Authors
Wilson Lau
Kevin Lybarger
Martin L. Gunn
Meliha Yetisgen
Affiliations
[1] University of Washington, Biomedical & Health Informatics, School of Medicine
[2] University of Washington, Department of Radiology, School of Medicine
Source
Journal of Digital Imaging | 2023 / Vol. 36, No. 1
Keywords
Natural language processing; Information extraction; Event extraction; Deep learning
DOI
Not available
Abstract
Radiology reports contain a diverse and rich set of clinical abnormalities documented by radiologists during their interpretation of the images. Comprehensive semantic representations of radiological findings would enable a wide range of secondary use applications to support diagnosis, triage, outcomes prediction, and clinical research. In this paper, we present a new corpus of radiology reports annotated with clinical findings. Our annotation schema captures detailed representations of pathologic findings that are observable on imaging (“lesions”) and other types of clinical problems (“medical problems”). The schema used an event-based representation to capture fine-grained details, including assertion, anatomy, characteristics, size, and count. Our gold standard corpus contained a total of 500 annotated computed tomography (CT) reports. We extracted triggers and argument entities using two state-of-the-art deep learning architectures, including BERT. We then predicted the linkages between trigger and argument entities (referred to as argument roles) using a BERT-based relation extraction model. We achieved the best extraction performance using a BERT model pre-trained on 3 million radiology reports from our institution: 90.9–93.4% F1 for finding triggers and 72.0–85.6% F1 for argument roles. To assess model generalizability, we used an external validation set randomly sampled from the MIMIC Chest X-ray (MIMIC-CXR) database. The extraction performance on this validation set was 95.6% for finding triggers and 79.1–89.7% for argument roles, demonstrating that the model generalized well to the cross-institutional data with a different imaging modality. We extracted the finding events from all the radiology reports in the MIMIC-CXR database and provided the extractions to the research community.
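The event-based representation described in the abstract can be pictured as a small data structure: a trigger span for the finding, plus argument spans linked to it by role. The sketch below is only an illustration under stated assumptions, not the authors' schema or code. The class names (`Span`, `Argument`, `FindingEvent`), the helper functions, the example sentence, and its annotations are all hypothetical; only the two finding types and the role names (assertion, anatomy, characteristics, size, count) come from the abstract. The `f1_score` helper is the standard harmonic mean of precision and recall over sets of predicted and gold items, which may differ in detail from the scoring used in the paper.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Span:
    """A piece of report text with character offsets (hypothetical type)."""
    text: str
    start: int
    end: int


@dataclass(frozen=True)
class Argument:
    """An argument entity linked to a trigger by a role name."""
    role: str  # e.g. "assertion", "anatomy", "characteristics", "size", "count"
    span: Span


@dataclass
class FindingEvent:
    """One finding event: a trigger plus its role-labelled arguments."""
    finding_type: str  # "lesion" or "medical_problem"
    trigger: Span
    arguments: List[Argument] = field(default_factory=list)

    def role_pairs(self) -> Set[Tuple[str, str]]:
        """Return the (role, text) pairs attached to this trigger."""
        return {(a.role, a.span.text) for a in self.arguments}


def find_span(text: str, phrase: str) -> Span:
    """Locate the first occurrence of phrase in text and wrap it as a Span."""
    start = text.index(phrase)
    return Span(phrase, start, start + len(phrase))


def f1_score(gold: Set, predicted: Set) -> float:
    """Standard F1 over two sets of items; 0.0 when nothing matches."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Made-up sentence and annotations, for illustration only.
sentence = "There is a 5 mm nodule in the right upper lobe."
event = FindingEvent(
    finding_type="lesion",
    trigger=find_span(sentence, "nodule"),
    arguments=[
        Argument("size", find_span(sentence, "5 mm")),
        Argument("anatomy", find_span(sentence, "right upper lobe")),
    ],
)

# A hypothetical system output that recovers only the size argument.
predicted_pairs = {("size", "5 mm")}
role_f1 = f1_score(event.role_pairs(), predicted_pairs)
```

Keeping character offsets on every span lets each trigger and argument be traced back to its position in the report, and comparing (role, text) pairs as sets is one simple way to score argument roles against a gold annotation.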
Pages: 91-104
Number of pages: 13