Improving radiology reporting accuracy: use of GPT-4 to reduce errors in reports

Times cited: 0
Authors
Mayes, Connor J. [1]
Reyes, Chloe [2]
Truman, Mia E. [3]
Dodoo, Christopher A. [3]
Adler, Cameron R. [2]
Banerjee, Imon [2]
Khandelwal, Ashish [4]
Alexander, Lauren F. [5]
Sheedy, Shannon P. [4]
Thompson, Cole P. [2]
Varner, Jacob A. [2]
Zulfiqar, Maria [2]
Tan, Nelly [2]
Affiliations
[1] Mayo Clin, Coll Med & Sci, Phoenix, AZ USA
[2] Mayo Clin, Phoenix, AZ 85054 USA
[3] Mayo Clin Scottsdale, Scottsdale, AZ USA
[4] Mayo Clin Rochester, Rochester, MN USA
[5] Mayo Clin Jacksonville, Jacksonville, FL USA
Keywords
Artificial intelligence; GPT-4; Radiology; Radiology reports
DOI
10.1007/s00261-025-05079-4
Chinese Library Classification (CLC) number
R8 [Special medicine]; R445 [Diagnostic imaging]
Discipline classification code
1002; 100207; 1009
Abstract
Purpose: Radiology reports are essential for communicating imaging findings to guide diagnosis and treatment. Although most radiology reports are accurate, errors can occur in the final reports due to high workloads, use of dictation software, and human error. Advanced artificial intelligence models, such as GPT-4, show potential as tools to improve report accuracy. This retrospective study evaluated how GPT-4 performed in detecting and correcting errors in finalized radiology reports in real-world settings for abdominopelvic computed tomography (CT) reports.
Methods: We evaluated finalized CT abdominopelvic reports from a tertiary health system by using GPT-4 with zero-shot learning techniques. Six radiologists each reviewed 100 of their finalized reports (randomly selected), evaluating GPT-4's suggested revisions for agreement, acceptance, and clinical impact. The radiologists' responses were compared by years in practice and sex.
Results: GPT-4 identified issues and suggested revisions for 91% of the 600 reports; most revisions addressed grammar (74%). The radiologists agreed with 27% of the revisions and accepted 23%. Most revisions were rated as having no (44%) or low (46%) clinical impact. Potential harm was rare (8%), with only 2 cases of potentially severe harm. Radiologists with less experience (≤ 7 years of practice) were more likely to agree with the revisions suggested by GPT-4 than those with more experience (34% vs. 20%, P = .003) and accepted a greater percentage of the revisions (32% vs. 15%, P = .003).
Conclusions: Although GPT-4 showed promise in identifying errors and improving the clarity of finalized radiology reports, most errors were categorized as minor, with no or low clinical impact. Collectively, the radiologists accepted 23% of the suggested revisions in their finalized reports. This study highlights the potential of GPT-4 as a prospective tool for radiology reporting, with further refinement needed for consistent use in clinical practice.
Pages: 8
Related papers
21 in total
[1]   Leveraging GPT-4 for Post Hoc Transformation of Free-text Radiology Reports into Structured Reporting: A Multilingual Feasibility Study [J].
Adams, Lisa C. ;
Truhn, Daniel ;
Busch, Felix ;
Kader, Avan ;
Niehues, Stefan M. ;
Makowski, Marcus R. ;
Bressem, Keno K. .
RADIOLOGY, 2023, 307 (04)
[2]   Mandating Limits on Workload, Duty, and Speed in Radiology [J].
Alexander, Robert ;
Waite, Stephen ;
Bruno, Michael A. ;
Krupinski, Elizabeth A. ;
Berlin, Leonard ;
Macknik, Stephen ;
Martinez-Conde, Susana .
RADIOLOGY, 2022, 304 (02) :274-282
[3]   Patient and Provider Feedback for Radiology Reports: Implementation of a Quality Improvement Project in a Multi-Institutional Setting [J].
Bavadian, Niusha ;
Tan, Nelly ;
Pesch, Arthur J. ;
McMullen, Kaley ;
Haman, Mike ;
Chan, Francis ;
Volk, Michael L. ;
Jacobson, J. Paul ;
Krishnaraj, Arun .
JOURNAL OF THE AMERICAN COLLEGE OF RADIOLOGY, 2021, 18 (10) :1430-1438
[4]   Large Language Models for Automated Synoptic Reports and Resectability Categorization in Pancreatic Cancer [J].
Bhayana, Rajesh ;
Nanda, Bipin ;
Dehkharghanian, Taher ;
Deng, Yangqing ;
Bhambra, Nishaant ;
Elias, Gavin ;
Datta, Daksh ;
Kambadakone, Avinash ;
Shwaartz, Chaya G. ;
Moulton, Carol-Anne ;
Henault, David ;
Gallinger, Steven ;
Krishna, Satheesh .
RADIOLOGY, 2024, 311 (03)
[5]   BI-RADS Category Assignments by GPT-3.5, GPT-4, and Google Bard: A Multilanguage Study [J].
Cozzi, Andrea ;
Pinker, Katja ;
Hidber, Andri ;
Zhang, Tianyu ;
Bonomo, Luca ;
Lo Gullo, Roberto ;
Christianson, Blake ;
Curti, Marco ;
Rizzo, Stefania ;
Del Grande, Filippo ;
Mann, Ritse M. ;
Schiaffino, Simone .
RADIOLOGY, 2024, 311 (01)
[6]   Enhanced PROcedural Information READability for Patient-Centered Care in Interventional Radiology With Large Language Models (PRO-READ IR) [J].
Elhakim, Tarig ;
Brea, Allison R. ;
Fidelis, Wilton ;
Paravastu, Sriram S. ;
Malavia, Mira ;
Omer, Mustafa ;
Mort, Ana ;
Ramasamy, Shakthi Kumaran ;
Tripathi, Satvik ;
Dezube, Michael ;
Smolinski-Zhao, Sara ;
Daye, Dania .
JOURNAL OF THE AMERICAN COLLEGE OF RADIOLOGY, 2025, 22 (01) :84-97
[7]   Potential of GPT-4 for Detecting Errors in Radiology Reports: Implications for Reporting Accuracy [J].
Gertz, Roman Johannes ;
Dratsch, Thomas ;
Bunck, Alexander Christian ;
Lennartz, Simon ;
Iuga, Andra-Iza ;
Hellmich, Martin Gunnar ;
Persigehl, Thorsten ;
Pennig, Lenhard ;
Gietzen, Carsten Herbert ;
Fervers, Philipp ;
Maintz, David ;
Hahnfeldt, Robert ;
Kottlors, Jonathan .
RADIOLOGY, 2024, 311 (01)
[8]   Enhancing Patient Communication With Chat-GPT in Radiology: Evaluating the Efficacy and Readability of Answers to Common Imaging-Related Questions [J].
Gordon, Emile B. ;
Towbin, Alexander J. ;
Wingrove, Peter ;
Sha, Umber ;
Haas, Brian ;
Kitts, Andrea B. ;
Feldman, Jill ;
Furlan, Alessandro .
JOURNAL OF THE AMERICAN COLLEGE OF RADIOLOGY, 2024, 21 (02) :353-359
[9]   Using GPT-4 for LI-RADS feature extraction and categorization with multilingual free-text reports [J].
Gu, Kyowon ;
Lee, Jeong Hyun ;
Shin, Jaeseung ;
Hwang, Jeong Ah ;
Min, Ji Hye ;
Jeong, Woo Kyoung ;
Lee, Min Woo ;
Song, Kyoung Doo ;
Bae, Sung Hwan .
LIVER INTERNATIONAL, 2024, 44 (07) :1578-1587
[10]   Harnessing Large Language Models for Structured Reporting in Breast Ultrasound: A Comparative Study of Open AI (GPT-4.0) and Microsoft Bing (GPT-4) [J].
Liu, ChaoXu ;
Wei, MinYan ;
Qin, Yu ;
Zhang, MeiXiang ;
Jiang, Huan ;
Xu, JiaLe ;
Zhang, YuNing ;
Hua, Qing ;
Hou, YiQing ;
Dong, YiJie ;
Xia, ShuJun ;
Li, Ning ;
Zhou, JianQiao .
ULTRASOUND IN MEDICINE AND BIOLOGY, 2024, 50 (11) :1697-1703