A COMPARATIVE STUDY: PERFORMANCE OF LARGE LANGUAGE MODELS IN SIMPLIFYING TURKISH COMPUTED TOMOGRAPHY REPORTS

Cited by: 0
Authors
Camur, Eren [1 ]
Cesur, Turay [2 ]
Gunes, Yasin Celal [3 ]
Affiliations
[1] Ankara 29 Mayis State Hosp, Dept Radiol, Minist Hlth, Ankara, Turkiye
[2] Mamak State Hosp, Dept Radiol, Minist Hlth, Ankara, Turkiye
[3] Kirikkale Yuksek Ihtisas Hosp, Dept Radiol, Minist Hlth, Kirikkale, Turkiye
Source
JOURNAL OF ISTANBUL FACULTY OF MEDICINE-ISTANBUL TIP FAKULTESI DERGISI | 2024 / Vol. 87 / No. 04
Keywords
Large language model; radiology reports; readability; computed tomography; Turkish; simplifying
DOI
10.26650/IUITFD.1494572
Chinese Library Classification number
R5 [Internal Medicine];
Discipline classification code
1002 ; 100201 ;
Abstract
Objective: This study evaluated the effectiveness of various large language models (LLMs) in simplifying Turkish reports of Computed Tomography (CT), a common imaging modality. Material and Method: Using fictional CT findings, we followed the Standards for Reporting of Diagnostic Accuracy Studies (STARD) and the Declaration of Helsinki. Fifty fictional Turkish CT findings were generated. Four LLMs (ChatGPT 4, ChatGPT-3.5, Gemini 1.5 Pro, and Claude 3 Opus) simplified the reports using the prompt: "Please explain them in a way that someone without a medical background can understand in Turkish." Evaluations were based on the Ateşman Readability Index and a Likert scale for accuracy and readability. Results: Claude 3 Opus scored the highest in readability (58.9), followed by ChatGPT-3.5 (54.5), Gemini 1.5 Pro (53.7), and ChatGPT 4 (45.1). Likert scores for Claude 3 Opus (mean: 4.7) and ChatGPT 4 (mean: 4.5) showed no significant difference (p>0.05). ChatGPT 4 had the highest word count (96.98) compared to Claude 3 Opus (90.6), Gemini 1.5 Pro (74.4), and ChatGPT-3.5 (38.7) (p<0.001). Conclusion: This study shows that LLMs can simplify Turkish CT reports to a level that individuals without medical knowledge can understand, with high readability and accuracy. ChatGPT 4 and Claude 3 Opus produced the most comprehensible simplifications. Claude 3 Opus' simpler sentences may make it the optimal choice for simplifying Turkish CT reports.
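The abstract names the Ateşman Readability Index but this record does not reproduce the formula. The sketch below is an illustration only, assuming the commonly cited form of the index: 198.825 - 40.175 x (average syllables per word) - 2.610 x (average words per sentence). The function names `count_syllables` and `atesman_score`, the regex-based sentence and word splitting, and the vowel-counting shortcut for Turkish syllables are assumptions made for this example; the tooling the authors actually used is not stated here.

```python
import re

# In Turkish every syllable contains exactly one vowel, so counting vowels
# gives the syllable count. Circumflexed vowels are included for loanwords.
TURKISH_VOWELS = set("aeıioöuüâîû")


def count_syllables(word: str) -> int:
    """Count syllables in a Turkish word by counting its vowels."""
    # str.lower() maps both "I" and "İ" to a form containing "i", which is
    # still a vowel, so the count is unaffected by Turkish casing rules.
    return sum(1 for ch in word.lower() if ch in TURKISH_VOWELS)


def atesman_score(text: str) -> float:
    """Return the Ateşman readability score (assumed formula, see lead-in)."""
    # Naive sentence split on ., ! and ? (an assumption; abbreviations and
    # decimal numbers would need more careful handling).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of Unicode letters; digits and underscores are excluded.
    words = re.findall(r"[^\W\d_]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")

    syllables = sum(count_syllables(w) for w in words)
    avg_syllables_per_word = syllables / len(words)
    avg_words_per_sentence = len(words) / len(sentences)
    return 198.825 - 40.175 * avg_syllables_per_word - 2.610 * avg_words_per_sentence


if __name__ == "__main__":
    # Hypothetical simplified finding, not taken from the study.
    sample = "Akciğerde küçük bir nodül var. Kontrol önerilir."
    print(round(atesman_score(sample), 2))
```

Higher scores indicate easier text, so shorter words and shorter sentences both raise the score. This matches the abstract's remark that Claude 3 Opus' simpler sentences went along with the highest readability value.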
Pages: 321 - 326
Page count: 6
Related papers
14 in total
  • [1] Atesman E., 1997, Dil Dergisi, V58, P71
  • [2] Bossuyt PM, 2015, BMJ-BRIT MED J, V351, DOI [10.1136/bmj.h5527, 10.1373/clinchem.2015.246280, 10.1148/radiol.2015151516]
  • [3] Large language models in radiology: fundamentals, applications, ethical considerations, risks, and future directions
    D'Antonoli, Tugba Akinci
    Stanzione, Arnaldo
    Bluethgen, Christian
    Vernuccio, Federica
    Ugga, Lorenzo
    Klontzas, Michail E.
    Cuocolo, Renato
    Cannella, Roberto
    Kocak, Burak
    [J]. DIAGNOSTIC AND INTERVENTIONAL RADIOLOGY, 2024, 30 (02) : 80 - 90
  • [4] Doshi R, 2023, MEDRXIV, DOI 10.1101/2023.06.04.23290786v2
  • [5] Guadalupe Ramos J, 2019, RES COMPUTING SCI, V148, P11
  • [6] ChatGPT makes medicine easy to swallow: an exploratory case study on simplified radiology reports
    Jeblick, Katharina
    Schachtner, Balthasar
    Dexl, Jakob
    Mittermeier, Andreas
    Stueber, Anna Theresa
    Topalis, Johanna
    Weber, Tobias
    Wesp, Philipp
    Sabel, Bastian Oliver
    Ricke, Jens
    Ingrisch, Michael
    [J]. EUROPEAN RADIOLOGY, 2024, 34 (05) : 2817 - 2825
  • [7] Johnson AEW, 2023, SCI DATA, V10, DOI 10.1038/s41597-022-01899-x
  • [8] Kung TH, 2023, PLOS DIGIT HLTH, V2
  • [9] Decoding radiology reports: Potential application of OpenAI ChatGPT to enhance patient understanding of diagnostic reports
    Li, Hanzhou
    Moon, John T.
    Iyer, Deepak
    Balthazar, Patricia
    Krupinski, Elizabeth A.
    Bercu, Zachary L.
    Newsome, Janice M.
    Banerjee, Imon
    Gichoya, Judy W.
    Trivedi, Hari M.
    [J]. CLINICAL IMAGING, 2023, 101 : 137 - 141
  • [10] A novel ILP framework for summarizing content with high lexical variety
    Luo, Wencan
    Liu, Fei
    Liu, Zitao
    Litman, Diane
    [J]. NATURAL LANGUAGE ENGINEERING, 2018, 24 (06) : 887 - 920