Event-Based Clinical Finding Extraction from Radiology Reports with Pre-trained Language Model

Authors
Wilson Lau
Kevin Lybarger
Martin L. Gunn
Meliha Yetisgen
Affiliations
[1] University of Washington, Biomedical & Health Informatics, School of Medicine
[2] University of Washington, Department of Radiology, School of Medicine
Source
Journal of Digital Imaging, 2023, Vol. 36
Keywords
Natural language processing; Information extraction; Event extraction; Deep learning
DOI
Not available
Abstract
Radiology reports contain a diverse and rich set of clinical abnormalities documented by radiologists during their interpretation of the images. Comprehensive semantic representations of radiological findings would enable a wide range of secondary use applications to support diagnosis, triage, outcomes prediction, and clinical research. In this paper, we present a new corpus of radiology reports annotated with clinical findings. Our annotation schema captures detailed representations of pathologic findings that are observable on imaging (“lesions”) and other types of clinical problems (“medical problems”). The schema used an event-based representation to capture fine-grained details, including assertion, anatomy, characteristics, size, and count. Our gold standard corpus contained a total of 500 annotated computed tomography (CT) reports. We extracted triggers and argument entities using two state-of-the-art deep learning architectures, including BERT. We then predicted the linkages between trigger and argument entities (referred to as argument roles) using a BERT-based relation extraction model. We achieved the best extraction performance using a BERT model pre-trained on 3 million radiology reports from our institution: 90.9–93.4% F1 for finding triggers and 72.0–85.6% F1 for argument roles. To assess model generalizability, we used an external validation set randomly sampled from the MIMIC Chest X-ray (MIMIC-CXR) database. The extraction performance on this validation set was 95.6% for finding triggers and 79.1–89.7% for argument roles, demonstrating that the model generalized well to the cross-institutional data with a different imaging modality. We extracted the finding events from all the radiology reports in the MIMIC-CXR database and provided the extractions to the research community.
Pages: 91-104
Page count: 13
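
The event-based representation described in the abstract (a finding trigger linked to arguments such as assertion, anatomy, characteristics, size, and count) can be pictured with a small data structure. The following is only a minimal sketch: the class names, field names, and the example sentence are hypothetical illustrations and are not the authors' annotation format or code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    """A stretch of report text, located by character offsets."""
    text: str
    start: int  # inclusive
    end: int    # exclusive


@dataclass
class Argument:
    """An entity linked to a trigger; the link type is the argument role."""
    role: str   # e.g. "assertion", "anatomy", "characteristics", "size", "count"
    span: Span


@dataclass
class FindingEvent:
    """One clinical finding: a trigger plus its role-labelled arguments."""
    event_type: str  # hypothetical labels: "lesion" or "medical_problem"
    trigger: Span
    arguments: List[Argument] = field(default_factory=list)

    def by_role(self, role: str) -> List[str]:
        """Return the text of every argument carrying the given role."""
        return [a.span.text for a in self.arguments if a.role == role]


def span_of(sentence: str, phrase: str) -> Span:
    """Locate the first occurrence of phrase in sentence (helper for the demo)."""
    start = sentence.index(phrase)
    return Span(phrase, start, start + len(phrase))


# Invented sentence, used only to illustrate the structure.
sentence = "There is a 5 mm nodule in the right upper lobe."
event = FindingEvent(
    event_type="lesion",
    trigger=span_of(sentence, "nodule"),
    arguments=[
        Argument("size", span_of(sentence, "5 mm")),
        Argument("anatomy", span_of(sentence, "right upper lobe")),
    ],
)

print(event.by_role("size"))
print(event.by_role("anatomy"))
```

In this picture, the entity extraction step would produce the `Span` objects, and the relation extraction step would decide which spans attach to which trigger and under which role.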