Evaluating the Accuracy and Readability of ChatGPT-4o's Responses to Patient-Based Questions about Keratoconus

Cited by: 0

Authors:

Balci, Ali Safa [1]

Cakmak, Semih [2]

Affiliations:

[1] Univ Hlth Sci, Sehit Prof Dr Ilhan Varank Sancaktepe Training & R, Dept Ophthalmol, Istanbul, Turkiye

[2] Istanbul Univ, Istanbul Fac Med, Dept Ophthalmol, Istanbul, Turkiye

Source:

OPHTHALMIC EPIDEMIOLOGY | 2025

Keywords:

ChatGPT-4o; healthcare; keratoconus; large language models; readability

DOI:

10.1080/09286586.2025.2484760

Chinese Library Classification (CLC) number:

R77 [Ophthalmology]

Discipline classification code:

100212

Abstract:

Purpose: This study aimed to evaluate the accuracy and readability of responses generated by ChatGPT-4o, an advanced large language model, to frequently asked patient-centered questions about keratoconus. Methods: A cross-sectional, observational study was conducted using ChatGPT-4o to answer 30 potential questions that could be asked by patients with keratoconus. The accuracy of the responses was evaluated by two board-certified ophthalmologists and scored on a scale of 1 to 5. Readability was assessed using the Simple Measure of Gobbledygook (SMOG), Flesch-Kincaid Grade Level (FKGL), and Flesch Reading Ease (FRE) scores. Descriptive, treatment-related, and follow-up-related questions were analyzed, and statistical comparisons between these categories were performed. Results: The mean accuracy score for the responses was 4.48 ± 0.57 on a 5-point Likert scale. The interrater reliability, with an intraclass correlation coefficient of 0.769, indicated a strong level of agreement. Readability scores revealed a SMOG score of 15.49 ± 1.74, an FKGL score of 14.95 ± 1.95, and an FRE score of 27.41 ± 9.71, indicating that a high level of education is required to comprehend the responses. There was no significant difference in accuracy among the different question categories (p = 0.161), but readability varied significantly, with treatment-related questions being the easiest to understand. Conclusion: ChatGPT-4o provides highly accurate responses to patient-centered questions about keratoconus, though the complexity of its language may limit accessibility for the general population. Further development is needed to enhance the readability of AI-generated medical content.
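For readers unfamiliar with the three readability indices named in the abstract, the following is a minimal sketch of their standard published formulas. It is not the authors' tooling, which the abstract does not describe. The function names are hypothetical, and the functions take precomputed counts (words, sentences, syllables, words of three or more syllables) so that no particular syllable-counting method is assumed.

```python
import math


def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease (FRE): higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level (FKGL): approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def smog_index(polysyllables: int, sentences: int) -> float:
    """SMOG grade from the count of words with three or more syllables.

    The count is normalized to a 30-sentence sample, as in the standard formula.
    """
    return 1.0430 * math.sqrt(polysyllables * (30 / sentences)) + 3.1291


if __name__ == "__main__":
    # Illustrative counts only; these are not data from the study.
    print(flesch_reading_ease(words=100, sentences=5, syllables=150))
    print(flesch_kincaid_grade(words=100, sentences=5, syllables=150))
    print(smog_index(polysyllables=12, sentences=10))
```

Under these formulas, longer sentences and more syllables per word lower the FRE score and raise the FKGL and SMOG grades, which is the pattern the reported scores reflect.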

Pages: 6
