Can ChatGPT-3.5 Pass a Medical Exam? A Systematic Review of ChatGPT's Performance in Academic Testing

Cited by: 19
Authors
Sumbal, Anusha [1]
Sumbal, Ramish [1]
Amir, Alina [1]
Affiliation
[1] Dow Univ Hlth Sci, Baba E Urdu Rd, Karachi 74200, Pakistan
Source
JOURNAL OF MEDICAL EDUCATION AND CURRICULAR DEVELOPMENT | 2024 / Vol. 11
Keywords
ChatGPT; academic performance; medical education; artificial intelligence; digital health; medicine;
DOI
10.1177/23821205241238641
Chinese Library Classification
G40 [Education]
Discipline classification codes
040101; 120403
Abstract
OBJECTIVE: This systematic review aims to assess the academic potential of ChatGPT-3.5, along with its strengths and limitations when taking medical exams.

METHOD: Following PRISMA guidelines, a systematic search of the literature was performed using the electronic databases PubMed/MEDLINE, Google Scholar, and Cochrane. Articles from database inception to April 4, 2023, were queried. A formal narrative analysis was conducted by systematically grouping the similarities and differences between individual findings.

RESULTS: After rigorous screening, 12 articles were included in this review. All the selected papers assessed the academic performance of ChatGPT-3.5. One study compared the performance of ChatGPT-3.5 with that of ChatGPT-4 when taking a medical exam. Overall, ChatGPT performed well in 4 tests, averagely in 4 tests, and poorly in 4 tests. ChatGPT's performance was directly proportional to the level of the questions' difficulty but did not differ notably depending on whether the questions were binary, descriptive, or MCQ-based. ChatGPT's explanation, reasoning, memory, and accuracy were remarkably good, whereas it failed to understand image-based questions and lacked insight and critical thinking.

CONCLUSION: ChatGPT-3.5 performed satisfactorily in the exams it took as an examinee. However, further studies are needed to fully explore the potential of ChatGPT in medical education.
Pages: 12