Can ChatGPT-3.5 Pass a Medical Exam? A Systematic Review of ChatGPT's Performance in Academic Testing

Cited by: 19
Authors
Sumbal, Anusha [1]
Sumbal, Ramish [1]
Amir, Alina [1]
Affiliations
[1] Dow Univ Hlth Sci, Baba E Urdu Rd, Karachi 74200, Pakistan
Source
JOURNAL OF MEDICAL EDUCATION AND CURRICULAR DEVELOPMENT | 2024 / Vol. 11
Keywords
ChatGPT; academic performance; medical education; artificial intelligence; digital health; medicine;
DOI
10.1177/23821205241238641
Chinese Library Classification (CLC) number
G40 [Education];
Discipline classification codes
040101 ; 120403 ;
Abstract
OBJECTIVE: This systematic review aims to assess the academic potential of ChatGPT-3.5, along with its strengths and limitations when taking medical exams.
METHOD: Following PRISMA guidelines, a systematic search of the literature was performed using the electronic databases PubMed/MEDLINE, Google Scholar, and Cochrane. Articles from database inception until April 4, 2023, were queried. A formal narrative analysis was conducted by systematically grouping the similarities and differences between individual findings.
RESULTS: After rigorous screening, 12 articles were included in this review. All the selected papers assessed the academic performance of ChatGPT-3.5. One study compared the performance of ChatGPT-3.5 with that of ChatGPT-4 on a medical exam. Overall, ChatGPT performed well in 4 tests, averagely in 4 tests, and poorly in 4 tests. ChatGPT's performance was directly proportional to the level of the questions' difficulty but was not notably affected by whether the questions were binary, descriptive, or MCQ-based. ChatGPT's explanation, reasoning, memory, and accuracy were remarkably good, whereas it failed to understand image-based questions and lacked insight and critical thinking.
CONCLUSION: ChatGPT-3.5 performed satisfactorily in the exams it took as an examinee. However, further studies are needed to fully explore the potential of ChatGPT in medical education.
Pages: 12
Related papers
50 in total
  • [31] A systematic review and meta-analysis on ChatGPT and its utilization in medical and dental research
    Bagde, Hiroj
    Dhopte, Ashwini
    Alam, Mohammad Khursheed
    Basri, Rehana
    HELIYON, 2023, 9 (12)
  • [32] Below average ChatGPT performance in medical microbiology exam compared to university students
    Sallam, Malik
    Al-Salahat, Khaled
    FRONTIERS IN EDUCATION, 2023, 8
  • [33] Exploring the Performance of ChatGPT Versions 3.5, 4, and 4 With Vision in the Chilean Medical Licensing Examination: Observational Study
    Rojas, Marcos
    Rojas, Marcelo
    Burgess, Valentina
    Toro-Perez, Javier
    Salehi, Shima
    JMIR MEDICAL EDUCATION, 2024, 10
  • [34] Performance of ChatGPT Across Different Versions in Medical Licensing Examinations Worldwide: Systematic Review and Meta-Analysis
    Liu, Mingxin
    Okuhara, Tsuyoshi
    Chang, XinYi
    Shirabe, Ritsuko
    Nishiie, Yuriko
    Okada, Hiroko
    Kiuchi, Takahiro
    JOURNAL OF MEDICAL INTERNET RESEARCH, 2024, 26
  • [35] ChatGPT's Performance in Spinal Metastasis Cases-Can We Discuss Our Complex Cases with ChatGPT?
    Heisinger, Stephan
    Salzmann, Stephan N.
    Senker, Wolfgang
    Aspalter, Stefan
    Oberndorfer, Johannes
    Matzner, Michael P.
    Stienen, Martin N.
    Motov, Stefan
    Huber, Dominikus
    Grohs, Josef Georg
    JOURNAL OF CLINICAL MEDICINE, 2024, 13 (24)
  • [36] Sailing the Seven Seas: A Multinational Comparison of ChatGPT's Performance on Medical Licensing Examinations
    Alfertshofer, Michael
    Hoch, Cosima C.
    Funk, Paul F.
    Hollmann, Katharina
    Wollenberg, Barbara
    Knoedler, Samuel
    Knoedler, Leonard
    ANNALS OF BIOMEDICAL ENGINEERING, 2024, 52 (06) : 1542 - 1545
  • [37] ChatGPT in medicine: A cross-disciplinary systematic review of ChatGPT's (artificial intelligence) role in research, clinical practice, education, and patient interaction
    Fatima, Afia
    Shafique, Muhammad Ashir
    Alam, Khadija
    Ahmed, Tagwa Kalool Fadlalla
    Mustafa, Muhammad Saqlain
    MEDICINE, 2024, 103 (32)
  • [38] Sailing the Seven Seas: A Multinational Comparison of ChatGPT’s Performance on Medical Licensing Examinations
    Michael Alfertshofer
    Cosima C. Hoch
    Paul F. Funk
    Katharina Hollmann
    Barbara Wollenberg
    Samuel Knoedler
    Leonard Knoedler
    Annals of Biomedical Engineering, 2024, 52 : 1542 - 1545
  • [39] Overview of Early ChatGPT's Presence in Medical Literature: Insights From a Hybrid Literature Review by ChatGPT and Human Experts
    Temsah, Omar
    Khan, Samina A.
    Chaiah, Yazan
    Senjab, Abdulrahman
    Alhasan, Khalid
    Jamal, Amr
    Aljamaan, Fadi
    Malki, Khalid H.
    Halwani, Rabih
    Al-Tawfiq, Jaffar A.
    Temsah, Mohamad-Hani
    Al-Eyadhy, Ayman
    CUREUS JOURNAL OF MEDICAL SCIENCE, 2023, 15 (04)
  • [40] Evaluation of ChatGPT-generated medical responses: A systematic review and meta-analysis
    Wei, Qiuhong
    Yao, Zhengxiong
    Cui, Ying
    Wei, Bo
    Jin, Zhezhen
    Xu, Ximing
    JOURNAL OF BIOMEDICAL INFORMATICS, 2024, 151