Exploring AI-chatbots' capability to suggest surgical planning in ophthalmology: ChatGPT versus Google Gemini analysis of retinal detachment cases

Cited by: 32
Authors
Carla, Matteo Mario [1 ,2 ]
Gambini, Gloria [1 ,2 ]
Baldascino, Antonio [1 ,2 ]
Giannuzzi, Federico [1 ,2 ]
Boselli, Francesco [1 ,2 ]
Crincoli, Emanuele [1 ,2 ]
D'Onofrio, Nicola Claudio [1 ,2 ]
Rizzo, Stanislao [1 ,2 ]
Affiliations
[1] Univ Cattolica Sacro Cuore, Ophthalmol Dept, Rome, Italy
[2] Fdn Policlin Univ A Gemelli, Ophthalmol Dept, IRCCS, Rome, Italy
Keywords
Retina; Vitreous; Medical Education; Ophthalmologic Surgical Procedures; Surveys and Questionnaires;
DOI
10.1136/bjo-2023-325143
Chinese Library Classification
R77 [Ophthalmology];
Discipline code
100212;
Abstract
Background We aimed to define the capability of three different publicly available large language models, Chat Generative Pretrained Transformer (ChatGPT-3.5), ChatGPT-4 and Google Gemini, in analysing retinal detachment cases and suggesting the best possible surgical planning.
Methods Analysis of 54 retinal detachment records entered into ChatGPT's and Gemini's interfaces. After asking 'Specify what kind of surgical planning you would suggest and the eventual intraocular tamponade.' and collecting the given answers, we assessed the level of agreement with the common opinion of three expert vitreoretinal surgeons. Moreover, ChatGPT's and Gemini's answers were graded 1-5 (from poor to excellent quality), according to the Global Quality Score (GQS).
Results After excluding 4 controversial cases, 50 cases were included. Overall, the surgical choices of ChatGPT-3.5, ChatGPT-4 and Google Gemini agreed with those of the vitreoretinal surgeons in 40/50 (80%), 42/50 (84%) and 35/50 (70%) of cases, respectively. Google Gemini was not able to respond in five cases. Contingency analysis showed significant differences between ChatGPT-4 and Gemini (p=0.03). ChatGPT's GQS were 3.9 +/- 0.8 and 4.2 +/- 0.7 for versions 3.5 and 4, while Gemini scored 3.5 +/- 1.1. There was no statistical difference between the two ChatGPT versions (p=0.22), while both outperformed Gemini's scores (p=0.03 and p=0.002, respectively). The main source of error was the choice of endotamponade (14% for ChatGPT-3.5 and ChatGPT-4, and 12% for Google Gemini). Only ChatGPT-4 was able to suggest a combined phacovitrectomy approach.
Conclusion Google Gemini and ChatGPT evaluated vitreoretinal patients' records in a coherent manner, showing a good level of agreement with expert surgeons. According to the GQS, ChatGPT's recommendations were much more accurate and precise.
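The contingency analysis of agreement described above can be sketched as a two-sided Fisher's exact test on a 2x2 table of agree/disagree counts per model. This is a minimal illustration, not the authors' analysis: the abstract does not state which test was used or how Gemini's five non-responses were handled, so the example counts below (non-responses treated as disagreement) are assumptions and need not reproduce the reported p=0.03.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows are the two models; columns are agree/disagree with the
    expert surgeons. Returns the two-sided p-value by summing the
    hypergeometric probabilities of all tables (with the same
    margins) no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    total = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x "agree" cases in row 1
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo = max(0, col1 - row2)   # smallest feasible count in cell a
    hi = min(col1, row1)       # largest feasible count in cell a
    # Small tolerance so floating-point ties count as "as extreme"
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Illustrative counts (assumed): ChatGPT-4 agreed in 42/50 cases,
# Gemini in 35/50, counting Gemini's non-responses as disagreement.
p = fisher_exact_two_sided(42, 8, 35, 15)
print(f"two-sided Fisher p = {p:.3f}")
```

In practice this pure-Python version matches `scipy.stats.fisher_exact` for 2x2 tables of this size; it is included here only to make the comparison concrete without external dependencies.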
Pages: 1457-1469
Page count: 13