Harnessing Large Language Models for Structured Reporting in Breast Ultrasound: A Comparative Study of Open AI (GPT-4.0) and Microsoft Bing (GPT-4)

Cited by: 5
Authors
Liu, ChaoXu [1, 2]
Wei, MinYan [1, 2]
Qin, Yu [1, 2]
Zhang, MeiXiang [1, 2]
Jiang, Huan [1, 2]
Xu, JiaLe [1, 2]
Zhang, YuNing [1, 2]
Hua, Qing [1, 2]
Hou, YiQing [1, 2]
Dong, YiJie [1, 2]
Xia, ShuJun [1, 2]
Li, Ning [3]
Zhou, JianQiao [1, 2]
Affiliations
[1] Shanghai Jiao Tong Univ, Ruijin Hosp, Dept Ultrasound, Sch Med, 197 Ruijin Er Rd, Shanghai 200025, Peoples R China
[2] Shanghai Jiao Tong Univ, Coll Hlth Sci & Technol, Sch Med, Shanghai, Peoples R China
[3] Dali Univ, Affiliated Hosp 7, Yunnan Kungang Hosp, Dept Ultrasound, Anning, Yunnan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Ultrasound; BIRADS; Large language models; GPT-4; Breast cancer; Reporting; Performance;
DOI
10.1016/j.ultrasmedbio.2024.07.007
Chinese Library Classification
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Objectives: To assess the ability of large language models (LLMs), specifically Open AI (GPT-4.0) and Microsoft Bing (GPT-4), to generate structured reports, Breast Imaging Reporting and Data System (BI-RADS) categories, and management recommendations from free-text breast ultrasound reports.
Materials and Methods: In this retrospective study, 100 free-text breast ultrasound reports were collected from patients who underwent surgery between January and May 2023. Open AI (GPT-4.0) and Microsoft Bing (GPT-4) were each used to convert these unstructured reports into structured ultrasound reports. Senior radiologists evaluated the quality of the structured reports, BI-RADS categories, and management recommendations generated by GPT-4.0 and Bing against the guidelines.
Results: Open AI (GPT-4.0) outperformed Microsoft Bing (GPT-4) in generating structured reports (88% vs. 55%; p < 0.001), assigning correct BI-RADS categories (54% vs. 47%; p = 0.013), and providing reasonable management recommendations (81% vs. 63%; p < 0.001). In predicting benign versus malignant characteristics, GPT-4.0 performed significantly better than Bing (AUC, 0.9317 vs. 0.8177; p < 0.001), while both performed significantly worse than senior radiologists (AUC, 0.9763; both p < 0.001).
Conclusion: This study highlights the potential of LLMs, specifically Open AI (GPT-4.0), to convert unstructured breast ultrasound reports into structured ones, offer accurate diagnoses, and provide reasonable recommendations.
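For readers unfamiliar with the AUC figures quoted in the abstract, the following is a minimal sketch of how an area under the ROC curve can be computed from ordinal suspicion scores and pathology labels, using the rank-based (Mann-Whitney) formulation. It is an illustration only, not the authors' analysis: the function name `auc_from_scores` and the toy data are hypothetical, and the record does not state which software or statistical tests the study used.

```python
def auc_from_scores(scores, labels):
    """Return the AUC as P(positive score > negative score) + 0.5 * P(tie).

    scores: ordinal suspicion scores, higher means more suspicious.
    labels: 1 for malignant, 0 for benign (reference standard).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # malignant case ranked above benign case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data (NOT from the study): ordinal scores standing in
# for assigned categories, and pathology labels (1 = malignant, 0 = benign).
toy_scores = [2, 3, 3, 4, 4, 5, 5, 3]
toy_labels = [0, 0, 0, 1, 0, 1, 1, 1]
toy_auc = auc_from_scores(toy_scores, toy_labels)
print(f"toy AUC = {toy_auc:.4f}")
```

An AUC of 1.0 means every malignant case received a higher score than every benign case, and 0.5 corresponds to chance-level ranking.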
Pages: 1697-1703
Number of pages: 7