Can ChatGPT-3.5 Pass a Medical Exam? A Systematic Review of ChatGPT's Performance in Academic Testing

Cited by: 19
Authors
Sumbal, Anusha [1]
Sumbal, Ramish [1]
Amir, Alina [1]
Affiliations
[1] Dow Univ Hlth Sci, Baba E Urdu Rd, Karachi 74200, Pakistan
Source
JOURNAL OF MEDICAL EDUCATION AND CURRICULAR DEVELOPMENT | 2024 / Vol. 11
Keywords
ChatGPT; academic performance; medical education; artificial intelligence; digital health; medicine;
DOI
10.1177/23821205241238641
Chinese Library Classification
G40 [Education];
Discipline classification code
040101 ; 120403 ;
Abstract
OBJECTIVE: This systematic review assesses the academic potential of ChatGPT-3.5, along with its strengths and limitations, when taking medical exams.
METHOD: Following PRISMA guidelines, a systematic search of the literature was performed using the electronic databases PubMed/MEDLINE, Google Scholar, and Cochrane. Articles from database inception until April 4, 2023, were queried. A formal narrative analysis was conducted by systematically grouping similarities and differences between individual findings.
RESULTS: After rigorous screening, 12 articles were included in this review. All the selected papers assessed the academic performance of ChatGPT-3.5. One study compared the performance of ChatGPT-3.5 with that of ChatGPT-4 on a medical exam. Overall, ChatGPT performed well in 4 tests, averagely in 4 tests, and poorly in 4 tests. ChatGPT's performance was directly proportional to the level of the questions' difficulty but did not differ notably by whether the questions were binary, descriptive, or MCQ-based. ChatGPT's explanation, reasoning, memory, and accuracy were remarkably good, whereas it failed to understand image-based questions and lacked insight and critical thinking.
CONCLUSION: ChatGPT-3.5 performed satisfactorily in the exams it took as an examinee. However, future studies are needed to fully explore the potential of ChatGPT in medical education.
Pages: 12
Related papers
50 in total
  • [21] Assessment of ChatGPT-3.5's Knowledge in Oncology: Comparative Study with ASCO-SEP Benchmarks
    Odabashian, Roupen
    Bastin, Donald
    Jones, Georden
    Manzoor, Maria
    Tangestaniapour, Sina
    Assad, Malke
    Lakhani, Sunita
    Odabashian, Maritsa
    Mcgee, Sharon
    JMIR AI, 2024, 3
  • [22] ChatGPT's performance on JSA-certified anesthesiologist exam
    Kinoshita, Michiko
    Komasaka, Mizuki
    Tanaka, Katsuya
    JOURNAL OF ANESTHESIA, 2024, 38 (02) : 282 - 283
  • [23] Performance of ChatGPT in medical examinations: A systematic review and a meta-analysis
    Levin, Gabriel
    Horesh, Nir
    Brezinov, Yoav
    Meyer, Raanan
    BJOG-AN INTERNATIONAL JOURNAL OF OBSTETRICS AND GYNAECOLOGY, 2024, 131 (03) : 378 - 380
  • [24] ChatGPT-3.5 and-4.0 and mechanical engineering: Examining performance on the FE mechanical engineering and undergraduate exams
    Frenkel, Matthew E.
    Emara, Hebah
    COMPUTER APPLICATIONS IN ENGINEERING EDUCATION, 2024, 32 (06)
  • [25] Can ChatGPT be the Plastic Surgeon's New Digital Assistant? A Bibliometric Analysis and Scoping Review of ChatGPT in Plastic Surgery Literature
    Hilary Y. Liu
    Mario Alessandri-Bonetti
    José Antonio Arellano
    Francesco M. Egro
    Aesthetic Plastic Surgery, 2024, 48 : 1644 - 1652
  • [26] More human than human? Differences in lexis and collocation within academic essays produced by ChatGPT-3.5 and human L2 writers
    Zhang, Mengxuan
    Crosthwaite, Peter
    IRAL-INTERNATIONAL REVIEW OF APPLIED LINGUISTICS IN LANGUAGE TEACHING, 2025,
  • [27] ChatGPT in radiology: A systematic review of performance, pitfalls, and future perspectives
    Keshavarz, Pedram
    Bagherieh, Sara
    Nabipoorashra, Seyed Ali
    Chalian, Hamid
    Rahsepar, Amir Ali
    Kim, Grace Hyun J.
    Hassani, Cameron
    Raman, Steven S.
    Bedayat, Arash
    DIAGNOSTIC AND INTERVENTIONAL IMAGING, 2024, 105 (7-8) : 251 - 265
  • [28] Can ChatGPT be the Plastic Surgeon's New Digital Assistant? A Bibliometric Analysis and Scoping Review of ChatGPT in Plastic Surgery Literature
    Liu, Hilary Y.
    Alessandri-Bonetti, Mario
    Arellano, Jose Antonio
    Egro, Francesco M.
    AESTHETIC PLASTIC SURGERY, 2024, 48 (08) : 1644 - 1652
  • [29] Assessing ChatGPT's ability to pass the FRCS orthopaedic part A exam: A critical analysis
    Saad, Ahmed
    Iyengar, Karthikeyan P.
    Kurisunkal, Vineet
    Botchu, Rajesh
    SURGEON-JOURNAL OF THE ROYAL COLLEGES OF SURGEONS OF EDINBURGH AND IRELAND, 2023, 21 (05) : 263 - 266
  • [30] Accuracy and consistency of ChatGPT-3.5 and-4 in providing differential diagnoses in oral and maxillofacial diseases: a comparative diagnostic performance analysis
    Tomo, Saygo
    Lechien, Jerome R.
    Bueno, Hugo Sobrinho
    Cantieri-Debortoli, Daniela Filie
    Simonato, Luciana Estevam
    CLINICAL ORAL INVESTIGATIONS, 2024, 28 (10)