Assessing the Responses of Large Language Models (ChatGPT-4, Gemini, and Microsoft Copilot) to Frequently Asked Questions in Breast Imaging: A Study on Readability and Accuracy

Cited by: 13
Authors
Tepe, Murat [1 ]
Emekli, Emre [2 ]
Affiliations
[1] Mediclin City Hosp, Radiol, Dubai, U Arab Emirates
[2] Eskisehir Osmangazi Univ, Hlth Practice & Res Hosp, Radiol, Eskisehir, Turkiye
Keywords
artificial intelligence; breast imaging; Microsoft Copilot; Gemini; ChatGPT; large language models
DOI
10.7759/cureus.59960
CLC Classification
R5 [Internal Medicine]
Subject Classification
1002; 100201
Abstract
Background: Large language models (LLMs) such as ChatGPT-4, Gemini, and Microsoft Copilot have become instrumental in various domains, including healthcare, where they can enhance health literacy and aid in patient decision-making. Given the complexities involved in breast imaging procedures, accurate and comprehensible information is vital for patient engagement and compliance. This study aims to evaluate the readability and accuracy of the information provided by three prominent LLMs (ChatGPT-4, Gemini, and Microsoft Copilot) in response to frequently asked questions in breast imaging, assessing their potential to improve patient understanding and facilitate healthcare communication.
Methodology: We collected the most common questions on breast imaging from clinical practice and posed them to the three LLMs. Their responses were analyzed for readability using the Flesch Reading Ease and Flesch-Kincaid Grade Level tests, and for accuracy using a radiologist-developed Likert-type scale.
Results: The study found significant variations among the LLMs. Gemini and Microsoft Copilot scored higher on readability scales (p < 0.001), indicating their responses were easier to understand. In contrast, ChatGPT-4 demonstrated greater accuracy in its responses (p < 0.001).
Conclusions: While LLMs such as ChatGPT-4 show promise in providing accurate responses, readability issues may limit their utility in patient education. Conversely, Gemini and Microsoft Copilot, despite being less accurate, are more accessible to a broader patient audience. Ongoing adjustments and evaluations of these models are essential to ensure they meet the diverse needs of patients, emphasizing the need for continuous improvement and oversight in the deployment of artificial intelligence technologies in healthcare.
Pages: 9