Accuracy and Readability of ChatGPT Responses to Patient-Centric Strabismus Questions

Cited by: 0
Authors
Gary, Ashlyn A. [1 ]
Lai, James M. [1 ]
Locatelli, Elyana V. T. [1 ]
Falcone, Michelle M. [2 ]
Cavuoto, Kara M. [2 ]
Affiliations
[1] Univ Miami, Miller Sch Med, Miami, FL USA
[2] Univ Miami, Miller Sch Med, Bascom Palmer Eye Inst, 900 NW 17th St, Miami, FL 33136 USA
Keywords
DOI
10.3928/01913913-20250110-02
CLC number
R77 [Ophthalmology]
Discipline code
100212
Abstract
Purpose: To assess the medical accuracy and readability of responses provided by ChatGPT (OpenAI), the most widely used artificial intelligence-powered chatbot, to questions about strabismus.
Methods: Thirty-four questions were input into ChatGPT 3.5 (free version) and 4.0 (paid version) at three time intervals (day 0, 1 week, and 1 month) in two distinct geographic locations (California and Florida) in March 2024. Two pediatric ophthalmologists rated responses as "acceptable," "accurate but missing key information or minor inaccuracies," or "inaccurate and potentially harmful." The online tool Readable measured the Flesch-Kincaid Grade Level and Flesch Reading Ease Score to assess readability.
Results: Overall, 64% of ChatGPT's responses were "acceptable," but the proportion of "acceptable" responses differed by version (47% for ChatGPT 3.5 vs 53% for 4.0, P < .05) and state (77% in California vs 51% in Florida, P < .001). Responses in Florida were more likely to be "inaccurate and potentially harmful" than those in California (6.9% vs 1.5%, P < .001). Over 1 month, the overall percentage of "acceptable" responses increased (60% at day 0, 64% at 1 week, and 67% at 1 month, P > .05), whereas the percentage of "inaccurate and potentially harmful" responses decreased (5% at day 0, 5% at 1 week, and 3% at 1 month, P > .05). On average, responses scored a Flesch-Kincaid Grade Level of 15, corresponding to a reading level above high school.
Conclusions: Although most of ChatGPT's responses to strabismus questions were clinically acceptable, responses varied across time and geographic regions. The average reading level exceeded high school, indicating low readability. Although ChatGPT shows potential as a supplementary resource for parents and patients with strabismus, improving the accuracy and readability of free versions of ChatGPT may increase its utility.
Pages: 8
Related papers (50 records)
  • [1] Acceptability and readability of ChatGPT-4 based responses for frequently asked questions about strabismus and amblyopia
    Guven, S.
    Ayyildiz, B.
    JOURNAL FRANCAIS D OPHTALMOLOGIE, 2025, 48 (03):
  • [2] Evaluating the Accuracy and Readability of ChatGPT-4o's Responses to Patient-Based Questions about Keratoconus
    Balci, Ali Safa
    Cakmak, Semih
    OPHTHALMIC EPIDEMIOLOGY, 2025,
  • [3] Presentation suitability and readability of ChatGPT's medical responses to patient questions about knee osteoarthritis
    Yoo, Myungeun
    Jang, Chan Woong
    HEALTH INFORMATICS JOURNAL, 2025, 31 (01)
  • [4] Readability, accuracy, and appropriateness of ChatGPT 4.0 responses for use in patient education materials for Condyloma acuminatum
    Moosvi, Nosheen
    Kovarik, Carrie
    CLINICS IN DERMATOLOGY, 2024, 42 (01) : 87 - 88
  • [5] Evaluating ChatGPT's efficacy and readability to common pediatric ophthalmology and strabismus-related questions
    Ahmed, H. Shafeeq
    Thrishulamurthy, Chinmayee J.
    EUROPEAN JOURNAL OF OPHTHALMOLOGY, 2025, 35 (02) : 466 - 473
  • [6] The patient-centric toolbox
    DeAnda, Abe, Jr.
    Balsam, Leora B.
    JOURNAL OF THORACIC AND CARDIOVASCULAR SURGERY, 2015, 150 (05): : E67 - E68
  • [7] Patient-centric discourse
    Miller, G
    HOSPITALS & HEALTH NETWORKS, 2004, 78 (11): : 8 - 8
  • [8] Assessing the accuracy and utility of ChatGPT responses to patient questions regarding posterior lumbar decompression
    Giakas, Alec M.
    Narayanan, Rajkishen
    Ezeonu, Teeto
    Dalton, Jonathan
    Lee, Yunsoo
    Henry, Tyler
    Mangan, John
    Schroeder, Gregory
    Vaccaro, Alexander
    Kepler, Christopher
    ARTIFICIAL INTELLIGENCE SURGERY, 2024, 4 (03): : 233 - 246
  • [9] Evaluation of the appropriateness and readability of ChatGPT responses to patient queries on uveitis
    Mohammadi, Saeed
    Khatri, Anadi K. C.
    Jain, Tanya
    Thng, Zheng Xian
    Yoo, Woong-Sun
    Yavari, Negin
    Mobasserian, Azadeh
    Bazojoo, Vahid
    Akhavanrezayat, Amir
    Tran, Anh Ngoc Tram
    Yasar, Cigdem
    Elaraby, Osama
    Gupta, Ankur Sudhir
    Hung, Jia-Horung
    El Feky, Dalia
    Nguyen, Quan Dong
    INVESTIGATIVE OPHTHALMOLOGY & VISUAL SCIENCE, 2024, 65 (07)
  • [10] Appropriateness and Readability of ChatGPT-3.5 Responses to Common Patient Questions on Age-Related Macular Degeneration
    Challa, Nayanika
    Luskey, Nina
    Wang, Daniel
    INVESTIGATIVE OPHTHALMOLOGY & VISUAL SCIENCE, 2024, 65 (07)