Event-Based Clinical Finding Extraction from Radiology Reports with Pre-trained Language Model

Cited by: 0
Authors
Wilson Lau
Kevin Lybarger
Martin L. Gunn
Meliha Yetisgen
Affiliations
[1] University of Washington, Biomedical & Health Informatics, School of Medicine
[2] University of Washington, Department of Radiology, School of Medicine
Source
Journal of Digital Imaging | 2023 / Vol. 36
Keywords
Natural language processing; Information extraction; Event extraction; Deep learning
DOI: not available
Abstract
Radiology reports contain a diverse and rich set of clinical abnormalities documented by radiologists during their interpretation of the images. Comprehensive semantic representations of radiological findings would enable a wide range of secondary use applications to support diagnosis, triage, outcomes prediction, and clinical research. In this paper, we present a new corpus of radiology reports annotated with clinical findings. Our annotation schema captures detailed representations of pathologic findings that are observable on imaging (“lesions”) and other types of clinical problems (“medical problems”). The schema used an event-based representation to capture fine-grained details, including assertion, anatomy, characteristics, size, and count. Our gold standard corpus contained a total of 500 annotated computed tomography (CT) reports. We extracted triggers and argument entities using two state-of-the-art deep learning architectures, including BERT. We then predicted the linkages between trigger and argument entities (referred to as argument roles) using a BERT-based relation extraction model. We achieved the best extraction performance using a BERT model pre-trained on 3 million radiology reports from our institution: 90.9–93.4% F1 for finding triggers and 72.0–85.6% F1 for argument roles. To assess model generalizability, we used an external validation set randomly sampled from the MIMIC Chest X-ray (MIMIC-CXR) database. The extraction performance on this validation set was 95.6% for finding triggers and 79.1–89.7% for argument roles, demonstrating that the model generalized well to the cross-institutional data with a different imaging modality. We extracted the finding events from all the radiology reports in the MIMIC-CXR database and provided the extractions to the research community.
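The abstract describes an event-based representation in which each finding has a trigger (e.g., "lesion" or "medical problem") linked to argument entities such as assertion, anatomy, characteristics, size, and count. As a minimal illustrative sketch (not the authors' actual data format or code; the class and field names below are hypothetical), such an event structure could be modeled as:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Argument:
    """An argument entity linked to a finding trigger via an argument role."""
    role: str          # e.g. "Assertion", "Anatomy", "Size", "Count"
    text: str          # surface text of the entity in the report
    span: Tuple[int, int]  # character offsets (start, end) in the report

@dataclass
class FindingEvent:
    """A clinical finding: a trigger plus its linked argument entities."""
    event_type: str    # e.g. "Lesion" or "Medical_Problem"
    trigger: str
    trigger_span: Tuple[int, int]
    arguments: List[Argument] = field(default_factory=list)

# Toy example of the representation for a single finding.
report = "There is a 2.3 cm nodule in the right upper lobe."
event = FindingEvent(
    event_type="Lesion",
    trigger="nodule",
    trigger_span=(18, 24),
    arguments=[
        Argument("Size", "2.3 cm", (11, 17)),
        Argument("Anatomy", "right upper lobe", (32, 48)),
    ],
)

# Sanity-check that every span points at the text it claims to cover.
assert report[slice(*event.trigger_span)] == event.trigger
for arg in event.arguments:
    assert report[slice(*arg.span)] == arg.text
```

In the paper's pipeline, the trigger and argument entities would come from a sequence-labeling model (e.g., BERT token classification), and the role links between them from a BERT-based relation extraction model; the structure above only illustrates the output schema.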
Pages: 91–104 (13 pages)