Evaluating the strengths and limitations of multimodal ChatGPT-4 in detecting glaucoma using fundus images

Cited by: 5
Authors
Alryalat, Saif Aldeen [1, 2]
Musleh, Ayman Mohammed [3]
Kahook, Malik Y. [4]
Affiliations
[1] Univ Jordan, Dept Ophthalmol, Amman, Jordan
[2] Houston Methodist Hosp, Dept Ophthalmol, Houston, TX USA
[3] Jordan Univ Hosp, Amman, Jordan
[4] Univ Colorado, Sch Med, Sue Anschutz Rodgers Eye Ctr, Dept Ophthalmol, Aurora, CO USA
Source
FRONTIERS IN OPHTHALMOLOGY | 2024 / Volume 4
Funding
UK Research and Innovation;
Keywords
large language models; glaucoma; artificial intelligence; ChatGPT; GPT;
DOI
10.3389/fopht.2024.1387190
Chinese Library Classification number
R77 [Ophthalmology];
Discipline classification code
100212;
Abstract
Overview: This study evaluates the diagnostic accuracy of a multimodal large language model (LLM), ChatGPT-4, in recognizing glaucoma using color fundus photographs (CFPs) with a benchmark dataset and without prior training or fine tuning.
Methods: The publicly accessible Retinal Fundus Glaucoma Challenge "REFUGE" dataset was utilized for analyses. The input data consisted of the entire 400 image testing set. The task involved classifying fundus images into either 'Likely Glaucomatous' or 'Likely Non-Glaucomatous'. We constructed a confusion matrix to visualize the results of predictions from ChatGPT-4, focusing on accuracy of binary classifications (glaucoma vs non-glaucoma).
Results: ChatGPT-4 demonstrated an accuracy of 90% with a 95% confidence interval (CI) of 87.06%-92.94%. The sensitivity was found to be 50% (95% CI: 34.51%-65.49%), while the specificity was 94.44% (95% CI: 92.08%-96.81%). The precision was recorded at 50% (95% CI: 34.51%-65.49%), and the F1 Score was 0.50.
Conclusion: ChatGPT-4 achieved relatively high diagnostic accuracy without prior fine tuning on CFPs. Considering the scarcity of data in specialized medical fields, including ophthalmology, the use of advanced AI techniques, such as LLMs, might require less data for training compared to other forms of AI with potential savings in time and financial resources. It may also pave the way for the development of innovative tools to support specialized medical care, particularly those dependent on multimodal data for diagnosis and follow-up, irrespective of resource constraints.
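The reported metrics can be cross-checked with a short calculation. The sketch below is illustrative only: the abstract does not give the confusion-matrix counts or the interval method, so the counts (20 true positives, 20 false negatives, 20 false positives, 340 true negatives) are back-calculated from the reported percentages under the assumption that the 400-image REFUGE test set holds 40 glaucoma and 360 non-glaucoma images, and the interval is assumed to be a normal-approximation (Wald) interval. The function names are hypothetical.

```python
import math
from statistics import NormalDist

# Two-sided 95% critical value of the standard normal distribution.
Z_95 = NormalDist().inv_cdf(0.975)


def wald_ci(p, n, z=Z_95):
    """Normal-approximation (Wald) confidence interval for a proportion."""
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def binary_metrics(tp, fn, fp, tn):
    """Point estimates and Wald intervals from confusion-matrix counts."""
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # recall on glaucomatous eyes
    specificity = tn / (tn + fp)   # recall on non-glaucomatous eyes
    precision = tp / (tp + fp)     # positive predictive value
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": (accuracy, wald_ci(accuracy, total)),
        "sensitivity": (sensitivity, wald_ci(sensitivity, tp + fn)),
        "specificity": (specificity, wald_ci(specificity, tn + fp)),
        "precision": (precision, wald_ci(precision, tp + fp)),
        "f1": (f1, None),
    }


# ASSUMED counts, inferred from the reported percentages; not stated in the abstract.
metrics = binary_metrics(tp=20, fn=20, fp=20, tn=340)

for name, (value, ci) in metrics.items():
    if ci is None:
        print(f"{name}: {value:.2f}")
    else:
        print(f"{name}: {value:.2%} (95% CI {ci[0]:.2%} to {ci[1]:.2%})")
```

Under these assumptions only about 10% of the images are glaucomatous, so overall accuracy is dominated by the non-glaucoma class; sensitivity and precision are the more informative figures for screening.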
Pages: 6
Related papers
18 in total
  • [1] Artificial Hallucinations in ChatGPT: Implications in Scientific Writing
    Alkaissi, Hussam
    McFarlane, Samy I.
    [J]. CUREUS JOURNAL OF MEDICAL SCIENCE, 2023, 15 (02)
  • [2] Artificial Intelligence and Glaucoma: Going Back to Basics
    AlRyalat, Saif Aldeen
    Singh, Praveer
    Kalpathy-Cramer, Jayashree
    Kahook, Malik Y.
    [J]. CLINICAL OPHTHALMOLOGY, 2023, 17 : 1525 - 1530
  • [3] Machine learning in glaucoma: a bibliometric analysis comparing computer science and medical fields' research
    AlRyalat, Saif Aldeen
    Al-Ryalat, Nosaiba
    Ryalat, Soukaina
    [J]. EXPERT REVIEW OF OPHTHALMOLOGY, 2021, 16 (06) : 511 - 515
  • [4] Performance of Generative Large Language Models on Ophthalmology Board-Style Questions
    Cai, Louis Z.
    Shaheen, Abdulla
    Jin, Andrew
    Fukui, Riya
    Yi, Jonathan S.
    Yannuzzi, Nicolas
    Alabiad, Chrisfouad
    [J]. AMERICAN JOURNAL OF OPHTHALMOLOGY, 2023, 254 : 141 - 149
  • [5] Diagnostic Accuracy of Artificial Intelligence in Glaucoma Screening and Clinical Practice
    Chaurasia, Abadh K.
    Greatbatch, Connor J.
    Hewitt, Alex W.
    [J]. JOURNAL OF GLAUCOMA, 2022, 31 (05) : 285 - 299
  • [6] The Use of ChatGPT to Assist in Diagnosing Glaucoma Based on Clinical Case Reports
    Delsoz, Mohammad
    Raja, Hina
    Madadi, Yeganeh
    Tang, Anthony A.
    Wirostko, Barbara M.
    Kahook, Malik Y.
    Yousefi, Siamak
    [J]. OPHTHALMOLOGY AND THERAPY, 2023, 12 (06) : 3121 - 3132
  • [7] CNN with Multiple Inputs for Automatic Glaucoma Assessment Using Fundus Images
    Elmoufidi, Abdelali
    Skouta, Ayoub
    Jai-Andaloussi, Said
    Ouchetto, Ouail
    [J]. INTERNATIONAL JOURNAL OF IMAGE AND GRAPHICS, 2023, 23 (01)
  • [8] A Novel Context Aware Joint Segmentation and Classification Framework for Glaucoma Detection
    Ganesh, S. Sankar
    Kannayeram, G.
    Karthick, Alagar
    Muhibbullah, M.
    [J]. COMPUTATIONAL AND MATHEMATICAL METHODS IN MEDICINE, 2021, 2021
  • [9] Survey of Hallucination in Natural Language Generation
    Ji, Ziwei
    Lee, Nayeon
    Frieske, Rita
    Yu, Tiezheng
    Su, Dan
    Xu, Yan
    Ishii, Etsuko
    Bang, Ye Jin
    Madotto, Andrea
    Fung, Pascale
    [J]. ACM COMPUTING SURVEYS, 2023, 55 (12)
  • [10] History of artificial intelligence in medicine
    Kaul, Vivek
    Enslin, Sarah
    Gross, Seth A.
    [J]. GASTROINTESTINAL ENDOSCOPY, 2020, 92 (04) : 807 - 812