Event-Based Clinical Finding Extraction from Radiology Reports with Pre-trained Language Model

Cited by: 0
Authors
Wilson Lau
Kevin Lybarger
Martin L. Gunn
Meliha Yetisgen
Affiliations
[1] University of Washington, Biomedical & Health Informatics, School of Medicine
[2] University of Washington, Department of Radiology, School of Medicine
Source
Journal of Digital Imaging | 2023 / Volume 36
Keywords
Natural language processing; Information extraction; Event extraction; Deep learning
DOI
Not available
Abstract
Radiology reports contain a diverse and rich set of clinical abnormalities documented by radiologists during their interpretation of the images. Comprehensive semantic representations of radiological findings would enable a wide range of secondary use applications to support diagnosis, triage, outcomes prediction, and clinical research. In this paper, we present a new corpus of radiology reports annotated with clinical findings. Our annotation schema captures detailed representations of pathologic findings that are observable on imaging (“lesions”) and other types of clinical problems (“medical problems”). The schema used an event-based representation to capture fine-grained details, including assertion, anatomy, characteristics, size, and count. Our gold standard corpus contained a total of 500 annotated computed tomography (CT) reports. We extracted triggers and argument entities using two state-of-the-art deep learning architectures, including BERT. We then predicted the linkages between trigger and argument entities (referred to as argument roles) using a BERT-based relation extraction model. We achieved the best extraction performance using a BERT model pre-trained on 3 million radiology reports from our institution: 90.9–93.4% F1 for finding triggers and 72.0–85.6% F1 for argument roles. To assess model generalizability, we used an external validation set randomly sampled from the MIMIC Chest X-ray (MIMIC-CXR) database. The extraction performance on this validation set was 95.6% for finding triggers and 79.1–89.7% for argument roles, demonstrating that the model generalized well to the cross-institutional data with a different imaging modality. We extracted the finding events from all the radiology reports in the MIMIC-CXR database and provided the extractions to the research community.
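The abstract describes an event-based schema in which each clinical finding is a trigger span linked to typed argument spans (assertion, anatomy, characteristics, size, count). As a minimal illustrative sketch only, the structure of such an extracted event might be modeled as follows; all class and field names here are hypothetical and do not come from the authors' released code.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical sketch of the event-based finding representation described
# in the abstract: a trigger entity plus typed argument entities.
# Names are illustrative assumptions, not the authors' actual schema code.

@dataclass
class Span:
    text: str
    start: int  # character offset of the span in the report text
    end: int    # exclusive end offset

@dataclass
class FindingEvent:
    trigger: Span                      # e.g. "nodule" (lesion) or "pneumonia" (medical problem)
    event_type: str                    # "Lesion" or "Medical-Problem"
    assertion: Optional[Span] = None   # e.g. "no evidence of"
    anatomy: List[Span] = field(default_factory=list)
    characteristics: List[Span] = field(default_factory=list)
    size: Optional[Span] = None
    count: Optional[Span] = None

# Toy report sentence with one lesion finding.
report = "A 5 mm nodule in the right upper lobe."
event = FindingEvent(
    trigger=Span("nodule", 7, 13),
    event_type="Lesion",
    anatomy=[Span("right upper lobe", 21, 37)],
    size=Span("5 mm", 2, 6),
)
print(event.trigger.text, event.size.text, event.anatomy[0].text)
```

In the pipeline the paper describes, a token-classification model (BERT) would propose the trigger and argument spans, and a separate BERT-based relation classifier would decide which argument spans attach to which trigger, filling a structure like the one above.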
Pages: 91 - 104
Number of pages: 13
Related Papers
50 items in total
  • [21] Talent Supply and Demand Matching Based on Prompt Learning and the Pre-Trained Language Model
    Li, Kunping
    Liu, Jianhua
    Zhuang, Cunbo
    APPLIED SCIENCES-BASEL, 2025, 15 (05):
  • [22] An analysis on language transfer of pre-trained language model with cross-lingual post-training
    Son, Suhyune
    Park, Chanjun
    Lee, Jungseob
    Shim, Midan
    Lee, Chanhee
    Jang, Yoonna
    Seo, Jaehyung
    Lim, Jungwoo
    Lim, Heuiseok
    EXPERT SYSTEMS WITH APPLICATIONS, 2025, 267
  • [23] Pre-trained Model Based Feature Envy Detection
    Ma, Wenhao
    Yu, Yaoxiang
    Ruan, Xiaoming
    Cai, Bo
    2023 IEEE/ACM 20TH INTERNATIONAL CONFERENCE ON MINING SOFTWARE REPOSITORIES, MSR, 2023, : 430 - 440
  • [24] Automatic Fixation of Decompilation Quirks Using Pre-trained Language Model
    Kaichi, Ryunosuke
    Matsumoto, Shinsuke
    Kusumoto, Shinji
    PRODUCT-FOCUSED SOFTWARE PROCESS IMPROVEMENT, PROFES 2023, PT I, 2024, 14483 : 259 - 266
  • [25] Question-answering Forestry Pre-trained Language Model: ForestBERT
    Tan, Jingwei
    Zhang, Huaiqing
    Liu, Yang
    Yang, Jie
    Zheng, Dongping
Linye Kexue/Scientia Silvae Sinicae, 2024, 60 (09): 99 - 110
  • [26] Measuring semantic similarity of clinical trial outcomes using deep pre-trained language representations
    Koroleva A.
    Kamath S.
    Paroubek P.
    Journal of Biomedical Informatics: X, 2019, 4
  • [27] iProL: identifying DNA promoters from sequence information based on Longformer pre-trained model
    Peng, Binchao
    Sun, Guicong
    Fan, Yongxian
    BMC BIOINFORMATICS, 2024, 25 (01):
  • [28] Few-shot medical relation extraction via prompt tuning enhanced pre-trained language model
    He, Guoxiu
    Huang, Chen
    NEUROCOMPUTING, 2025, 633
  • [29] A Transformer Based Approach To Detect Suicidal Ideation Using Pre-Trained Language Models
    Haque, Farsheed
    Nur, Ragib Un
    Al Jahan, Shaeekh
    Mahmud, Zarar
    Shah, Faisal Muhammad
    2020 23RD INTERNATIONAL CONFERENCE ON COMPUTER AND INFORMATION TECHNOLOGY (ICCIT 2020), 2020,
  • [30] Predictive Recognition of DNA-binding Proteins Based on Pre-trained Language Model BERT
    Ma, Yue
    Pei, Yongzhen
    Li, Changguo
    JOURNAL OF BIOINFORMATICS AND COMPUTATIONAL BIOLOGY, 2023, 21 (06)