Event-Based Clinical Finding Extraction from Radiology Reports with Pre-trained Language Model

Cited by: 0
Authors
Wilson Lau
Kevin Lybarger
Martin L. Gunn
Meliha Yetisgen
Affiliations
[1] University of Washington, Biomedical & Health Informatics, School of Medicine
[2] University of Washington, Department of Radiology, School of Medicine
Source
Journal of Digital Imaging | 2023 / Vol. 36
Keywords
Natural language processing; Information extraction; Event extraction; Deep learning
DOI: not available
Abstract
Radiology reports contain a diverse and rich set of clinical abnormalities documented by radiologists during their interpretation of the images. Comprehensive semantic representations of radiological findings would enable a wide range of secondary use applications to support diagnosis, triage, outcomes prediction, and clinical research. In this paper, we present a new corpus of radiology reports annotated with clinical findings. Our annotation schema captures detailed representations of pathologic findings that are observable on imaging (“lesions”) and other types of clinical problems (“medical problems”). The schema used an event-based representation to capture fine-grained details, including assertion, anatomy, characteristics, size, and count. Our gold standard corpus contained a total of 500 annotated computed tomography (CT) reports. We extracted triggers and argument entities using two state-of-the-art deep learning architectures, including BERT. We then predicted the linkages between trigger and argument entities (referred to as argument roles) using a BERT-based relation extraction model. We achieved the best extraction performance using a BERT model pre-trained on 3 million radiology reports from our institution: 90.9–93.4% F1 for finding triggers and 72.0–85.6% F1 for argument roles. To assess model generalizability, we used an external validation set randomly sampled from the MIMIC Chest X-ray (MIMIC-CXR) database. The extraction performance on this validation set was 95.6% for finding triggers and 79.1–89.7% for argument roles, demonstrating that the model generalized well to the cross-institutional data with a different imaging modality. We extracted the finding events from all the radiology reports in the MIMIC-CXR database and provided the extractions to the research community.
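The abstract describes an event-based representation in which each finding has a trigger (e.g., "lesion" or "medical problem") linked to argument entities such as assertion, anatomy, characteristics, size, and count. As a minimal illustrative sketch (not the authors' actual data format or code; the class and field names below are hypothetical), such an event structure could be modeled as:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class Argument:
    """An argument entity linked to a finding trigger via an argument role."""
    role: str          # e.g. "Assertion", "Anatomy", "Size", "Count"
    text: str          # surface text of the entity in the report
    span: Tuple[int, int]  # character offsets (start, end) in the report

@dataclass
class FindingEvent:
    """A clinical finding: a trigger plus its linked argument entities."""
    event_type: str    # e.g. "Lesion" or "Medical_Problem"
    trigger: str
    trigger_span: Tuple[int, int]
    arguments: List[Argument] = field(default_factory=list)

# Toy example of the representation for a single finding.
report = "There is a 2.3 cm nodule in the right upper lobe."
event = FindingEvent(
    event_type="Lesion",
    trigger="nodule",
    trigger_span=(18, 24),
    arguments=[
        Argument("Size", "2.3 cm", (11, 17)),
        Argument("Anatomy", "right upper lobe", (32, 48)),
    ],
)

# Sanity-check that every span points at the text it claims to cover.
assert report[slice(*event.trigger_span)] == event.trigger
for arg in event.arguments:
    assert report[slice(*arg.span)] == arg.text
```

In the paper's pipeline, the trigger and argument entities would come from a sequence-labeling model (e.g., BERT token classification), and the role links between them from a BERT-based relation extraction model; the structure above only illustrates the output schema.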
Pages: 91–104 (13 pages)