Harnessing Large Language Models for Structured Reporting in Breast Ultrasound: A Comparative Study of Open AI (GPT-4.0) and Microsoft Bing (GPT-4)

Cited by: 1
Authors
Liu, ChaoXu [1, 2]
Wei, MinYan [1, 2]
Qin, Yu [1, 2]
Zhang, MeiXiang [1, 2]
Jiang, Huan [1, 2]
Xu, JiaLe [1, 2]
Zhang, YuNing [1, 2]
Hua, Qing [1, 2]
Hou, YiQing [1, 2]
Dong, YiJie [1, 2]
Xia, ShuJun [1, 2]
Li, Ning [3]
Zhou, JianQiao [1, 2]
Affiliations
[1] Shanghai Jiao Tong Univ, Ruijin Hosp, Dept Ultrasound, Sch Med, 197 Ruijin Er Rd, Shanghai 200025, Peoples R China
[2] Shanghai Jiao Tong Univ, Coll Hlth Sci & Technol, Sch Med, Shanghai, Peoples R China
[3] Dali Univ, Affiliated Hosp 7, Yunnan Kungang Hosp, Dept Ultrasound, Anning, Yunnan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Ultrasound; BIRADS; Large language models; GPT-4; Breast cancer; Reporting; Performance;
DOI
10.1016/j.ultrasmedbio.2024.07.007
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
Objectives: To assess the capabilities of large language models (LLMs), including Open AI (GPT-4.0) and Microsoft Bing (GPT-4), in generating structured reports, Breast Imaging Reporting and Data System (BI-RADS) categories, and management recommendations from free-text breast ultrasound reports.
Materials and Methods: In this retrospective study, 100 free-text breast ultrasound reports were gathered from patients who underwent surgery between January and May 2023. The ability of Open AI (GPT-4.0) and Microsoft Bing (GPT-4) to convert these unstructured reports into structured ultrasound reports was studied. Senior radiologists evaluated the quality of the structured reports, BI-RADS categories, and management recommendations generated by GPT-4.0 and Bing based on the guidelines.
Results: Open AI (GPT-4.0) outperformed Microsoft Bing (GPT-4) in generating structured reports (88% vs. 55%; p < 0.001), assigning correct BI-RADS categories (54% vs. 47%; p = 0.013), and providing reasonable management recommendations (81% vs. 63%; p < 0.001). In predicting whether lesions were benign or malignant, GPT-4.0 performed significantly better than Bing (AUC, 0.9317 vs. 0.8177; p < 0.001), while both performed significantly worse than senior radiologists (AUC, 0.9763; both p < 0.001).
Conclusion: This study highlights the potential of LLMs, specifically Open AI (GPT-4.0), in converting unstructured breast ultrasound reports into structured ones, offering accurate diagnoses, and providing reasonable recommendations.
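The abstract compares readers by AUC for predicting benign versus malignant lesions. The record does not state how the AUC values were computed, so the following is only a minimal sketch of one common approach: treat the assigned BI-RADS category as an ordinal score and compute the AUC as the probability that a malignant case receives a higher score than a benign one (ties count as one half). The function name `auc_from_scores`, the `BIRADS_RANK` mapping, and the eight toy cases are all hypothetical and are not data from the study.

```python
from itertools import product


def auc_from_scores(scores, labels):
    """Rank-based AUC: P(score of a positive > score of a negative), ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical ordinal mapping of BI-RADS categories (an assumption, not from the paper).
BIRADS_RANK = {"2": 1, "3": 2, "4A": 3, "4B": 4, "4C": 5, "5": 6}

# Hypothetical toy cases: assigned category and pathology (1 = malignant, 0 = benign).
categories = ["3", "4A", "4C", "5", "4B", "3", "4A", "5"]
malignant = [0, 0, 1, 1, 1, 0, 1, 1]

scores = [BIRADS_RANK[c] for c in categories]
auc = auc_from_scores(scores, malignant)
print(f"AUC on toy data: {auc:.4f}")
```

Testing whether two AUCs differ significantly on the same cases, as reported in the abstract, needs a paired method such as the DeLong test, which this sketch does not implement.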
Pages: 1697 - 1703
Page count: 7
Related papers
24 in total
  • [1] Leveraging GPT-4 for Post Hoc Transformation of Free-text Radiology Reports into Structured Reporting: A Multilingual Feasibility Study
    Adams, Lisa C.
    Truhn, Daniel
    Busch, Felix
    Kader, Avan
    Niehues, Stefan M.
    Makowski, Marcus R.
    Bressem, Keno K.
    [J]. RADIOLOGY, 2023, 307 (04)
  • [2] The ethical, legal and social implications of using artificial intelligence systems in breast cancer care
    Carter, Stacy M.
    Rogers, Wendy
    Win, Khin Than
    Frazer, Helen
    Richards, Bernadette
    Houssami, Nehmat
    [J]. BREAST, 2020, 49 : 25 - 32
  • [3] CASCADE PN, 1994, RADIOLOGY, V192, pA50
  • [4] BI-RADS Category Assignments by GPT-3.5, GPT-4, and Google Bard: A Multilanguage Study
    Cozzi, Andrea
    Pinker, Katja
    Hidber, Andri
    Zhang, Tianyu
    Bonomo, Luca
    Lo Gullo, Roberto
    Christianson, Blake
    Curti, Marco
    Rizzo, Stefania
    Del Grande, Filippo
    Mann, Ritse M.
    Schiaffino, Simone
    [J]. RADIOLOGY, 2024, 311 (01)
  • [5] ABSTRACTS WRITTEN BY CHATGPT FOOL SCIENTISTS
    Else, Holly
    [J]. NATURE, 2023, 613 (7944) : 423 - 423
  • [6] European Society of Radiology (ESR) and American College of Radiology (ACR) report of the 2015 global summit on radiological quality and safety
    European Society of Radiology
    American College of Radiology
    [J]. INSIGHTS INTO IMAGING, 2016, 7 (04) : 481 - 484
  • [7] The Role of Large Language Models (LLMs) in Providing Triage for Maxillofacial Trauma Cases: A Preliminary Study
    Frosolini, Andrea
    Catarzi, Lisa
    Benedetti, Simone
    Latini, Linda
    Chisci, Glauco
    Franz, Leonardo
    Gennaro, Paolo
    Gabriele, Guido
    [J]. DIAGNOSTICS, 2024, 14 (08)
  • [8] Breast Cancer Statistics, 2022
    Giaquinto, Angela N.
    Sung, Hyuna
    Miller, Kimberly D.
    Kramer, Joan L.
    Newman, Lisa A.
    Minihan, Adair
    Jemal, Ahmedin
    Siegel, Rebecca L.
    [J]. CA-A CANCER JOURNAL FOR CLINICIANS, 2022, 72 (06) : 524 - 541
  • [9] Prompt Engineering with ChatGPT: A Guide for Academic Writers
    Giray, Louie
    [J]. ANNALS OF BIOMEDICAL ENGINEERING, 2023, 51 (12) : 2629 - 2633
  • [10] Advancing medical imaging with language models: featuring a spotlight on ChatGPT
    Hu, Mingzhe
    Qian, Joshua
    Pan, Shaoyan
    Li, Yuheng
    Qiu, Richard L. J.
    Yang, Xiaofeng
    [J]. PHYSICS IN MEDICINE AND BIOLOGY, 2024, 69 (10)