The intent of ChatGPT usage and its robustness in medical proficiency exams: a systematic review

Cited: 0
Authors
Tatiana Chaiban [1 ]
Zeinab Nahle [2 ]
Ghaith Assi [2 ]
Michelle Cherfane [2 ]
Affiliations
[1] Department of Social and Education Sciences, School of Arts and Sciences, Lebanese American University, Beirut
[2] Gilbert and Rose-Marie Chagoury School of Medicine, Lebanese American University, P.O. Box 36, Byblos
[3] INSPECT-LB (Institut National de Santé Publique, d’Épidémiologie Clinique Et de Toxicologie-Liban), Beirut
Source
Discover Education, Vol. 3, Issue 1
Keywords
ChatGPT; Subspecialties; Written medical examinations
D O I
10.1007/s44217-024-00332-2
Abstract
Background: Since its launch, ChatGPT, a Large Language Model (LLM), has been widely used across different disciplines, particularly in the medical field. Objective: The main aim of this review is to thoroughly assess the performance of distinct versions of ChatGPT on subspecialty written medical proficiency exams and the factors that impact it. Methods: Three online databases were searched for articles that fit the intended objectives of the study: PubMed, CINAHL, and Web of Science. A group of reviewers was assembled to create an appropriate methodological framework for selecting the articles to be included. Results: Sixteen articles were included in this review, assessing the performance of different ChatGPT versions across subspecialty written examinations in fields such as surgery, neurology, orthopedics, trauma and orthopedics, core cardiology, family medicine, and dermatology. The studies reported different passing grades and rankings, with accuracy rates ranging from 35.8% to 91% across datasets and subspecialties. The factors highlighted as impacting accuracy were: (1) the ChatGPT version; (2) the medical subspecialty; (3) the type of question; (4) the language; and (5) the comparators. Conclusions: This review characterizes ChatGPT’s performance on different medical specialty examinations and motivates further research into whether ChatGPT can enhance learning and support medical students taking a range of medical specialty exams. However, to avoid misuse and any detrimental effects on real-world medicine, it is crucial to be aware of its limitations and to improve the ongoing evaluation of this AI tool. © The Author(s) 2024.
Related papers
6 items
  • [1] Systematic review of ChatGPT accuracy and performance in Iran's medical licensing exams: A brief report
    Keshtkar, Alireza
    Atighi, Farnaz
    Reihani, Hamid
    JOURNAL OF EDUCATION AND HEALTH PROMOTION, 2024, 13 (01)
  • [2] A systematic review and meta-analysis on ChatGPT and its utilization in medical and dental research
    Bagde, Hiroj
    Dhopte, Ashwini
    Alam, Mohammad Khursheed
    Basri, Rehana
    HELIYON, 2023, 9 (12)
  • [3] Performance of ChatGPT in medical examinations: A systematic review and a meta-analysis
    Levin, Gabriel
    Horesh, Nir
    Brezinov, Yoav
    Meyer, Raanan
    BJOG-AN INTERNATIONAL JOURNAL OF OBSTETRICS AND GYNAECOLOGY, 2024, 131 (03) : 378 - 380
  • [4] Can ChatGPT-3.5 Pass a Medical Exam? A Systematic Review of ChatGPT's Performance in Academic Testing
    Sumbal, Anusha
    Sumbal, Ramish
    Amir, Alina
    JOURNAL OF MEDICAL EDUCATION AND CURRICULAR DEVELOPMENT, 2024, 11
  • [5] Evaluation of ChatGPT-generated medical responses: A systematic review and meta-analysis
    Wei, Qiuhong
    Yao, Zhengxiong
    Cui, Ying
    Wei, Bo
    Jin, Zhezhen
    Xu, Ximing
    JOURNAL OF BIOMEDICAL INFORMATICS, 2024, 151
  • [6] ChatGPT integration within nursing education and its implications for nursing students: A systematic review and text network analysis
    Gunawan, Joko
    Aungsuroch, Yupin
    Montayre, Jed
    NURSE EDUCATION TODAY, 2024, 141