Automated classification of brain MRI reports using fine-tuned large language models

Cited by: 5
Authors
Kanzawa, Jun [1]
Yasaka, Koichiro [1]
Fujita, Nana [1]
Fujiwara, Shin [1]
Abe, Osamu [1]
Affiliations
[1] Univ Tokyo Hosp, Dept Radiol, Tokyo, Japan
Keywords
Brain tumor; Magnetic resonance imaging; Natural language processing; Large language model
DOI
10.1007/s00234-024-03427-7
Chinese Library Classification number
R74 [Neurology and Psychiatry]
Abstract
Purpose: This study aimed to investigate the efficacy of fine-tuned large language models (LLMs) in classifying brain MRI reports into pretreatment, posttreatment, and nontumor cases.

Methods: This retrospective study included 759, 284, and 164 brain MRI reports for the training, validation, and test datasets, respectively. Radiologists stratified the reports into three groups: nontumor (group 1), posttreatment tumor (group 2), and pretreatment tumor (group 3) cases. A pretrained Japanese Bidirectional Encoder Representations from Transformers (BERT) model was fine-tuned using the training dataset and evaluated on the validation dataset. The model that demonstrated the highest accuracy on the validation dataset was selected as the final model. Two additional radiologists classified the reports in the test dataset into the three groups. The model's performance on the test dataset was compared with that of the two radiologists.

Results: The fine-tuned LLM attained an overall accuracy of 0.970 (95% CI: 0.930-0.990). The model's sensitivity for groups 1/2/3 was 1.000/0.864/0.978. The model's specificity for groups 1/2/3 was 0.991/0.993/0.958. No statistically significant differences were found in accuracy, sensitivity, or specificity between the LLM and the human readers (p ≥ 0.371). The LLM completed the classification task approximately 20- to 26-fold faster than the radiologists. The area under the receiver operating characteristic curve was 0.994 (95% CI: 0.982-1.000) for discriminating groups 2 and 3 from group 1 and 0.992 (95% CI: 0.982-1.000) for discriminating group 3 from groups 1 and 2.

Conclusion: The fine-tuned LLM demonstrated performance comparable to that of radiologists in classifying brain MRI reports, while requiring substantially less time.
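
The per-group sensitivity and specificity reported above are one-vs-rest quantities: each group in turn is treated as the positive class and the other two as negative. The sketch below shows one way such figures can be derived from a three-class confusion matrix, together with a Wilson score interval for overall accuracy. The counts in `toy` are invented for illustration and are not the study's data, the function names are hypothetical, and the abstract does not state which confidence interval method the authors used, so the Wilson interval is an assumption.

```python
from math import sqrt


def one_vs_rest_metrics(confusion, k):
    """Return (sensitivity, specificity) for class k versus all other classes.

    confusion[i][j] is the number of reports whose true class is i
    and whose predicted class is j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp                                # true k, predicted other
    fp = sum(confusion[i][k] for i in range(n_classes)) - tp   # true other, predicted k
    tn = total - tp - fn - fp
    return tp / (tp + fn), tn / (tn + fp)


def accuracy(confusion):
    """Fraction of reports on the diagonal of the confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives roughly 95%)."""
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# Invented counts (NOT from the study): rows are true groups 1-3,
# columns are predicted groups 1-3.
toy = [
    [50, 0, 0],
    [1, 18, 1],
    [0, 2, 28],
]

if __name__ == "__main__":
    for k in range(3):
        sens, spec = one_vs_rest_metrics(toy, k)
        print(f"group {k + 1}: sensitivity={sens:.3f}, specificity={spec:.3f}")
    n = sum(sum(row) for row in toy)
    correct = sum(toy[i][i] for i in range(3))
    low, high = wilson_interval(correct, n)
    print(f"accuracy={accuracy(toy):.3f}, 95% CI {low:.3f}-{high:.3f}")
```

With a small test set, the interval width depends noticeably on the method chosen (Wilson, exact binomial, or bootstrap), so intervals computed this way need not match the published ones.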
Pages: 2177-2183
Page count: 7