Can ChatGPT-3.5 Pass a Medical Exam? A Systematic Review of ChatGPT's Performance in Academic Testing

Cited by: 35

Authors:

Sumbal, Anusha ^{[1]}

Sumbal, Ramish ^{[1]}

Amir, Alina ^{[1]}

Affiliations:

[1] Dow Univ Hlth Sci, Baba E Urdu Rd, Karachi 74200, Pakistan

Source:

JOURNAL OF MEDICAL EDUCATION AND CURRICULAR DEVELOPMENT | 2024 / Volume 11

Keywords:

ChatGPT; academic performance; medical education; artificial intelligence; digital health; medicine;

DOI:

10.1177/23821205241238641

Chinese Library Classification (CLC) number:

G40 [Education];

Discipline classification code:

040101 ; 120403 ;

Abstract:

OBJECTIVE We aim to conduct a systematic review to assess the academic potential of ChatGPT-3.5, along with its strengths and limitations when taking medical exams.

METHOD Following PRISMA guidelines, a systematic search of the literature was performed using the electronic databases PUBMED/MEDLINE, Google Scholar, and Cochrane. Articles from database inception until April 4, 2023, were queried. A formal narrative analysis was conducted by systematically grouping the similarities and differences between individual findings.

RESULTS After rigorous screening, 12 articles were included in this review. All the selected papers assessed the academic performance of ChatGPT-3.5. One study compared the performance of ChatGPT-3.5 with that of ChatGPT-4 when taking a medical exam. Overall, ChatGPT performed well in 4 tests, averagely in 4 tests, and poorly in 4 tests. ChatGPT's performance was directly proportional to the difficulty level of the questions but did not differ notably by whether the questions were binary, descriptive, or MCQ-based. ChatGPT's explanation, reasoning, memory, and accuracy were remarkably good, whereas it failed to understand image-based questions and lacked insight and critical thinking.

CONCLUSION ChatGPT-3.5 performed satisfactorily in the exams it took as an examinee. However, further studies are needed to fully explore the potential of ChatGPT in medical education.

Pages: 12
