Event-Based Clinical Finding Extraction from Radiology Reports with Pre-trained Language Model

Cited by: 0

Authors:

Wilson Lau

Kevin Lybarger

Martin L. Gunn

Meliha Yetisgen

Affiliations:

[1] University of Washington, Biomedical & Health Informatics, School of Medicine

[2] University of Washington, Department of Radiology, School of Medicine

Source:

Journal of Digital Imaging | 2023 / Vol. 36

Keywords:

Natural language processing; Information extraction; Event extraction; Deep learning

DOI:

Not available

Abstract:

Radiology reports contain a diverse and rich set of clinical abnormalities documented by radiologists during their interpretation of the images. Comprehensive semantic representations of radiological findings would enable a wide range of secondary use applications to support diagnosis, triage, outcomes prediction, and clinical research. In this paper, we present a new corpus of radiology reports annotated with clinical findings. Our annotation schema captures detailed representations of pathologic findings that are observable on imaging (“lesions”) and other types of clinical problems (“medical problems”). The schema used an event-based representation to capture fine-grained details, including assertion, anatomy, characteristics, size, and count. Our gold standard corpus contained a total of 500 annotated computed tomography (CT) reports. We extracted triggers and argument entities using two state-of-the-art deep learning architectures, including BERT. We then predicted the linkages between trigger and argument entities (referred to as argument roles) using a BERT-based relation extraction model. We achieved the best extraction performance using a BERT model pre-trained on 3 million radiology reports from our institution: 90.9–93.4% F1 for finding triggers and 72.0–85.6% F1 for argument roles. To assess model generalizability, we used an external validation set randomly sampled from the MIMIC Chest X-ray (MIMIC-CXR) database. The extraction performance on this validation set was 95.6% for finding triggers and 79.1–89.7% for argument roles, demonstrating that the model generalized well to the cross-institutional data with a different imaging modality. We extracted the finding events from all the radiology reports in the MIMIC-CXR database and provided the extractions to the research community.
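The event-based representation described in the abstract can be pictured as a trigger span plus a set of role-labeled argument spans. The following is a minimal sketch under stated assumptions, not the authors' annotation format: the class names `Span` and `FindingEvent`, the helper `find_span`, the label strings ("Lesion", "Anatomy", "Size", "Count"), and the example sentence are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Span:
    """A text span with character offsets into the report sentence."""
    text: str
    start: int
    end: int


@dataclass
class FindingEvent:
    """One finding event: a trigger plus argument spans grouped by role.

    The event type and role names are illustrative assumptions loosely
    based on the detail types listed in the abstract.
    """
    event_type: str
    trigger: Span
    arguments: Dict[str, List[Span]] = field(default_factory=dict)

    def add_argument(self, role: str, span: Span) -> None:
        # Link an argument entity to the trigger under the given role.
        self.arguments.setdefault(role, []).append(span)


def find_span(sentence: str, phrase: str) -> Span:
    """Locate the first occurrence of phrase and return it as a Span."""
    start = sentence.index(phrase)
    return Span(phrase, start, start + len(phrase))


# Hypothetical example sentence, not taken from the corpus.
sentence = "There are two 5 mm nodules in the right upper lobe."

event = FindingEvent("Lesion", find_span(sentence, "nodules"))
event.add_argument("Count", find_span(sentence, "two"))
event.add_argument("Size", find_span(sentence, "5 mm"))
event.add_argument("Anatomy", find_span(sentence, "right upper lobe"))

for role, spans in event.arguments.items():
    print(role, [s.text for s in spans])
```

In this sketch, entity extraction would correspond to producing the `Span` objects, and argument-role prediction would correspond to deciding which spans are attached to which trigger and under which role.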

Pages: 91-104

Page count: 13
