Large language models as assistance for glaucoma surgical cases: a ChatGPT vs. Google Gemini comparison

Cited by: 29
Authors
Carla, Matteo Mario [1, 2]
Gambini, Gloria [1, 2]
Baldascino, Antonio [1, 2]
Boselli, Francesco [1, 2]
Giannuzzi, Federico [1, 2]
Margollicci, Fabio [1, 2]
Rizzo, Stanislao [1, 2]
Affiliations
[1] Fdn Policlin Univ A Gemelli, Ophthalmol Dept, IRCCS, I-00168 Rome, Italy
[2] Univ Cattolica Sacro Cuore, Ophthalmol Dept, Largo A Gemelli 8, Rome, Italy
Keywords
Large language models (LLM); ChatGPT; Google Gemini; Google Bard; Glaucoma; Artificial intelligence (AI); Glaucoma surgery; OPEN-ANGLE GLAUCOMA;
DOI
10.1007/s00417-024-06470-5
Chinese Library Classification number
R77 [Ophthalmology];
Discipline classification code
100212;
Abstract
Purpose: The aim of this study was to assess the capability of ChatGPT-4 and Google Gemini to analyze detailed glaucoma case descriptions and suggest an accurate surgical plan.
Methods: In this retrospective analysis, 60 medical records of surgical glaucoma cases were divided into "ordinary" (n = 40) and "challenging" (n = 20) scenarios. Case descriptions were entered into the ChatGPT and Bard interfaces with the question "What kind of surgery would you perform?", and each query was repeated three times to analyze the consistency of the answers. After collecting the answers, we assessed the level of agreement with the unified opinion of three glaucoma surgeons. Moreover, we graded the quality of the responses from 1 (poor quality) to 5 (excellent quality) according to the Global Quality Score (GQS) and compared the results.
Results: ChatGPT's surgical choice was consistent with that of the glaucoma specialists in 35/60 cases (58%), compared to 19/60 (32%) for Gemini (p = 0.0001). Gemini was not able to complete the task in 16 cases (27%). Trabeculectomy was the most frequent choice for both chatbots (53% and 50% for ChatGPT and Gemini, respectively). In "challenging" cases, ChatGPT agreed with specialists in 9/20 choices (45%), outperforming Google Gemini (4/20, 20%). Overall, GQS scores were 3.5 ± 1.2 and 2.1 ± 1.5 for ChatGPT and Gemini, respectively (p = 0.002). This difference was even more marked when focusing only on "challenging" cases (1.5 ± 1.4 vs. 3.0 ± 1.5, p = 0.001).
Conclusion: ChatGPT-4 showed good analytical performance for glaucoma surgical cases, both ordinary and challenging. On the other hand, Google Gemini showed strong limitations in this setting, with high rates of imprecise or missed answers.
Pages: 2945-2959
Number of pages: 15
Related papers
36 entries in total
[1]  
Ali Rohaid, 2023, Neurosurgery, V93, P1090, DOI 10.1227/neu.0000000000002551
[2]  
Alser, 2023, AM J MED OPEN, V9, P1
[3]   Evaluating the Performance of ChatGPT in Ophthalmology [J].
Antaki, Fares ;
Touma, Samir ;
Milad, Daniel ;
El-Khoury, Jonathan ;
Duval, Renaud .
OPHTHALMOLOGY SCIENCE, 2023, 3 (04)
[4]   A systematic review of patient inflammatory bowel disease information resources on the world wide web [J].
Bernard, Andre ;
Langille, Morgan ;
Hughes, Stephanie ;
Rose, Caren ;
Leddin, Desmond ;
van Zanten, Sander Veldhuyzen .
AMERICAN JOURNAL OF GASTROENTEROLOGY, 2007, 102 (09) :2070-2077
[5]   Evolving Surgical Interventions in the Treatment of Glaucoma [J].
Bovee, Courtney E. ;
Pasquale, Louis R. .
SEMINARS IN OPHTHALMOLOGY, 2017, 32 (01) :91-95
[6]  
Brown TB, 2020, ADV NEUR IN, V33
[7]   Exploring AI-chatbots' capability to suggest surgical planning in ophthalmology: ChatGPT versus Google Gemini analysis of retinal detachment cases [J].
Carla, Matteo Mario ;
Gambini, Gloria ;
Baldascino, Antonio ;
Giannuzzi, Federico ;
Boselli, Francesco ;
Crincoli, Emanuele ;
D'Onofrio, Nicola Claudio ;
Rizzo, Stanislao .
BRITISH JOURNAL OF OPHTHALMOLOGY, 2024, 108 (10) :1457-1469
[8]   ChatGPT in medicine: an overview of its applications, advantages, limitations, future prospects, and ethical considerations [J].
Dave, Tirth ;
Athaluri, Sai Anirudh ;
Singh, Satyam .
FRONTIERS IN ARTIFICIAL INTELLIGENCE, 2023, 6
[9]   The Use of ChatGPT to Assist in Diagnosing Glaucoma Based on Clinical Case Reports [J].
Delsoz, Mohammad ;
Raja, Hina ;
Madadi, Yeganeh ;
Tang, Anthony A. ;
Wirostko, Barbara M. ;
Kahook, Malik Y. ;
Yousefi, Siamak .
OPHTHALMOLOGY AND THERAPY, 2023, 12 (06) :3121-3132
[10]   Priorities for successful use of artificial intelligence by public health organizations: a literature review [J].
Fisher, Stacey ;
Rosella, Laura C. .
BMC PUBLIC HEALTH, 2022, 22 (01)