Evaluation of ChatGPT and Google Bard Using Prompt Engineering in Cancer Screening Algorithms

Citations: 11
Authors
Nguyen, Daniel [1 ]
Swanson, Daniel [1 ]
Newbury, Alex [2 ]
Kim, Young H. [2 ]
Affiliations
[1] University of Massachusetts Chan Medical School, Worcester, MA 01655, USA
[2] University of Massachusetts Chan Medical School, Department of Radiology, Worcester, MA, USA
DOI
10.1016/j.acra.2023.11.002
Chinese Library Classification (CLC)
R8 [Special Medicine]; R445 [Diagnostic Imaging]
Discipline Classification Codes
1002; 100207; 1009
Abstract
Large language models (LLMs) such as ChatGPT and Bard have emerged as powerful tools in medicine, showcasing strong results in tasks such as radiology report translations and research paper drafting. While their implementation in clinical practice holds promise, their response accuracy remains variable. This study aimed to evaluate the accuracy of ChatGPT and Bard in clinical decision-making based on the American College of Radiology Appropriateness Criteria for various cancers. Both LLMs were evaluated in terms of their responses to open-ended (OE) and select-all-that-apply (SATA) prompts. Furthermore, the study incorporated prompt engineering (PE) techniques to enhance the accuracy of LLM outputs. The results revealed similar performances between ChatGPT and Bard on OE prompts, with ChatGPT exhibiting marginally higher accuracy in SATA scenarios. The introduction of PE also marginally improved LLM outputs in OE prompts but did not enhance SATA responses. The results highlight the potential of LLMs in aiding clinical decision-making, especially when guided by optimally engineered prompts. Future studies in diverse clinical situations are imperative to better understand the impact of LLMs in radiology. © 2024 The Association of University Radiologists. Published by Elsevier Inc. All rights reserved.
Pages: 1799-1804
Number of pages: 6
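
The evaluation design summarized in the abstract (OE versus SATA prompts, with and without a prompt-engineering prefix, graded against the ACR Appropriateness Criteria) can be illustrated with a short Python sketch. This is not code from the study: the prompt wording, the screening scenario, the option list, and the SATA scoring rule below are hypothetical placeholders meant only to show how such prompts and their grading might be structured.

# Hypothetical sketch of the OE/SATA prompt setup described in the abstract.
# Prompt text, scenario, options, and scoring rule are illustrative assumptions,
# not the study's actual materials.

PE_PREFIX = (
    "You are a radiologist. Answer strictly according to the "
    "ACR Appropriateness Criteria for cancer screening.\n\n"
)

def open_ended_prompt(scenario: str, use_pe: bool = False) -> str:
    """Build an open-ended (OE) prompt asking for the most appropriate study."""
    body = f"What is the most appropriate imaging study for: {scenario}?"
    return PE_PREFIX + body if use_pe else body

def sata_prompt(scenario: str, options: list[str], use_pe: bool = False) -> str:
    """Build a select-all-that-apply (SATA) prompt listing candidate studies."""
    listed = "\n".join(f"{i + 1}. {opt}" for i, opt in enumerate(options))
    body = f"Scenario: {scenario}\nSelect all appropriate imaging studies:\n{listed}"
    return PE_PREFIX + body if use_pe else body

def score_sata(options: list[str], selected: set[str], answer_key: set[str]) -> float:
    """Fraction of options the model classified correctly (assumed scoring rule)."""
    correct = sum((opt in selected) == (opt in answer_key) for opt in options)
    return correct / len(options)

if __name__ == "__main__":
    scenario = "average-risk adult, age 50, colorectal cancer screening"  # hypothetical
    options = ["CT colonography", "Colonoscopy", "MRI abdomen without contrast"]
    print(open_ended_prompt(scenario, use_pe=True))
    print(sata_prompt(scenario, options))
    # Grade a hypothetical model selection against a hypothetical answer key.
    print(score_sata(options, {"Colonoscopy"}, {"Colonoscopy", "CT colonography"}))

In such a setup, the PE condition simply prepends a role- and source-constraining prefix to the same underlying question, so OE and SATA accuracy can be compared with and without prompt engineering on identical scenarios.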