Artificial intelligence in pediatric ophthalmology: a comparative study of ChatGPT-4.0 and DeepSeek-R1 performance

Cited by: 0

Authors:

Karatas, Gamze [1]

Karatas, Mehmet Egemen [2]

Affiliations:

[1] Prof Dr Cemil Tascioglu City Hosp, Dept Ophthalmol, Darulaceze Cad 27, Istanbul, Turkiye

[2] Sisli Hamidiye Etfal Training & Res Hosp, Dept Ophthalmol, Istanbul, Turkiye

Source:

STRABISMUS | 2025

Keywords:

Artificial intelligence; ChatGPT-4.0; DeepSeek-R1; pediatric ophthalmology; strabismus

DOI:

10.1080/09273972.2025.2536782

Chinese Library Classification number:

R77 [Ophthalmology]

Discipline classification code:

100212

Abstract:

Objective: To evaluate and compare the accuracy of two large language models (LLMs), ChatGPT-4.0 and DeepSeek-R1, in answering pediatric ophthalmology questions. Methods: A total of 44 multiple-choice questions covering various subspecialties of pediatric ophthalmology were selected. Both LLMs answered these questions, and their responses were compared for accuracy. Results: ChatGPT-4.0 answered 82% of the questions correctly, while DeepSeek-R1 answered 93% correctly (p = 0.06). On strabismus questions, ChatGPT-4.0 answered 70% correctly and DeepSeek-R1 82% (p = 0.50). In the other subspecialties, ChatGPT-4.0 answered 89% correctly and DeepSeek-R1 100% (p = 0.25). Conclusion: DeepSeek-R1 outperformed ChatGPT-4.0 in overall accuracy on pediatric ophthalmology questions. These findings suggest the need for further optimization of LLMs to enhance their performance and reliability in clinical settings, especially in pediatric ophthalmology.
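The abstract reports percentages and p-values but names neither the statistical test nor the raw counts. The sketch below shows one way such a paired comparison could be computed, under assumptions that are not stated in the source: (a) an exact two-sided McNemar test is used, since both models answered the same questions; (b) the question counts (44 overall, 17 strabismus, 27 other) and correct-answer counts are inferred from the rounded percentages; (c) purely for illustration, every question ChatGPT-4.0 answered correctly is assumed to have been answered correctly by DeepSeek-R1 as well. The function name `mcnemar_exact` and all counts are hypothetical.

```python
from math import comb


def mcnemar_exact(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant-pair counts.

    b: questions model A got right and model B got wrong
    c: questions model A got wrong and model B got right
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, no evidence of a difference
    k = min(b, c)
    # Binomial(n, 0.5) lower tail, doubled for a two-sided test.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# ASSUMED counts: (questions, ChatGPT-4.0 correct, DeepSeek-R1 correct).
# These are inferred from the rounded percentages in the abstract and
# are not reported in the source.
subsets = {
    "overall": (44, 36, 41),
    "strabismus": (17, 12, 14),
    "other": (27, 24, 27),
}

p_values = {}
for name, (n, gpt_ok, ds_ok) in subsets.items():
    b = 0               # ASSUMED: no question where only ChatGPT-4.0 was right
    c = ds_ok - gpt_ok  # questions where only DeepSeek-R1 was right
    p_values[name] = mcnemar_exact(b, c)
    print(f"{name}: {100 * gpt_ok / n:.1f}% vs {100 * ds_ok / n:.1f}%, "
          f"p = {p_values[name]:.4f}")
```

Under these assumptions the p-values come out as 0.0625, 0.5, and 0.25, which round to the figures in the abstract. Note that 12/17 is 70.6%, which the abstract gives as 70%. This agreement is suggestive only; it does not confirm the test or the counts the authors actually used.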


Pages: 7
