Evaluation of ChatGPT and Google Bard Using Prompt Engineering in Cancer Screening Algorithms

Citations: 11
Authors
Nguyen, Daniel [1 ]
Swanson, Daniel [1 ]
Newbury, Alex [2 ]
Kim, Young H. [2 ]
Affiliations
[1] University of Massachusetts Chan Medical School, Worcester, MA 01655, USA
[2] University of Massachusetts Chan Medical School, Department of Radiology, Worcester, MA, USA
DOI
10.1016/j.acra.2023.11.002
Chinese Library Classification (CLC)
R8 [Special Medicine]; R445 [Diagnostic Imaging]
Discipline Classification Codes
1002; 100207; 1009
Abstract
Large language models (LLMs) such as ChatGPT and Bard have emerged as powerful tools in medicine, showcasing strong results in tasks such as radiology report translations and research paper drafting. While their implementation in clinical practice holds promise, their response accuracy remains variable. This study aimed to evaluate the accuracy of ChatGPT and Bard in clinical decision-making based on the American College of Radiology Appropriateness Criteria for various cancers. Both LLMs were evaluated in terms of their responses to open-ended (OE) and select-all-that-apply (SATA) prompts. Furthermore, the study incorporated prompt engineering (PE) techniques to enhance the accuracy of LLM outputs. The results revealed similar performances between ChatGPT and Bard on OE prompts, with ChatGPT exhibiting marginally higher accuracy in SATA scenarios. The introduction of PE also marginally improved LLM outputs in OE prompts but did not enhance SATA responses. The results highlight the potential of LLMs in aiding clinical decision-making, especially when guided by optimally engineered prompts. Future studies in diverse clinical situations are imperative to better understand the impact of LLMs in radiology. © 2024 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
Pages: 1799-1804
Number of pages: 6
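
The evaluation design summarized in the abstract (OE versus SATA prompts, with and without a prompt-engineering prefix, graded against the ACR Appropriateness Criteria) can be illustrated with a short Python sketch. This is not code from the study: the prompt wording, the screening scenario, the option list, and the SATA scoring rule below are hypothetical placeholders meant only to show how such prompts and their grading might be structured.

# Hypothetical sketch of the OE/SATA prompt setup described in the abstract.
# Prompt text, scenario, options, and scoring rule are illustrative assumptions,
# not the study's actual materials.

PE_PREFIX = (
    "You are a radiologist. Answer strictly according to the "
    "ACR Appropriateness Criteria for cancer screening.\n\n"
)

def open_ended_prompt(scenario: str, use_pe: bool = False) -> str:
    """Build an open-ended (OE) prompt asking for the most appropriate study."""
    body = f"What is the most appropriate imaging study for: {scenario}?"
    return PE_PREFIX + body if use_pe else body

def sata_prompt(scenario: str, options: list[str], use_pe: bool = False) -> str:
    """Build a select-all-that-apply (SATA) prompt listing candidate studies."""
    listed = "\n".join(f"{i + 1}. {opt}" for i, opt in enumerate(options))
    body = f"Scenario: {scenario}\nSelect all appropriate imaging studies:\n{listed}"
    return PE_PREFIX + body if use_pe else body

def score_sata(options: list[str], selected: set[str], answer_key: set[str]) -> float:
    """Fraction of options the model classified correctly (assumed scoring rule)."""
    correct = sum((opt in selected) == (opt in answer_key) for opt in options)
    return correct / len(options)

if __name__ == "__main__":
    scenario = "average-risk adult, age 50, colorectal cancer screening"  # hypothetical
    options = ["CT colonography", "Colonoscopy", "MRI abdomen without contrast"]
    print(open_ended_prompt(scenario, use_pe=True))
    print(sata_prompt(scenario, options))
    # Grade a hypothetical model selection against a hypothetical answer key.
    print(score_sata(options, {"Colonoscopy"}, {"Colonoscopy", "CT colonography"}))

In such a setup, the PE condition simply prepends a role- and source-constraining prefix to the same underlying question, so OE and SATA accuracy can be compared with and without prompt engineering on identical scenarios.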