Performance of ChatGPT 3.5 and 4 on U.S. dental examinations: the INBDE, ADAT, and DAT

Cited by: 5
Authors
Dashti, Mahmood [1 ]
Ghasemi, Shohreh [2 ]
Ghadimi, Niloofar [3 ]
Hefzi, Delband [4 ]
Karimian, Azizeh [5 ]
Zare, Niusha [6 ]
Fahimipour, Amir [7 ]
Khurshid, Zohaib [8 ]
Chafjiri, Maryam Mohammadalizadeh [9 ]
Ghaedsharaf, Sahar [10 ]
Affiliations
[1] Shahid Beheshti Univ Med Sci, Res Inst Dent Sci, Dentofacial Deform Res Ctr, Tehran, Iran
[2] Queen Mary Coll, Dept Trauma & Craniofacial Reconstruct, London, England
[3] Islamic Azad Univ Med Sci, Dent Sch, Dept Oral & Maxillofacial Radiol, Tehran, Iran
[4] Univ Tehran Med Sci, Sch Dent, Tehran, Iran
[5] Golestan Univ Med Sci, Dent Res Ctr, Dept Biostat, Gorgan, Iran
[6] Univ Southern Calif, Dept Operat Dent, Los Angeles, CA USA
[7] Univ Sydney, Discipline Oral Surg Med & Diagnost, Sch Dent, Fac Med & Hlth,Westmead Ctr Oral Hlth, Sydney, Australia
[8] King Faisal Univ, Dept Prosthodont & Dent Implantol, Al Hasa, Saudi Arabia
[9] Shahid Beheshti Univ Med Sci, Sch Dent, Dept Oral & Maxillofacial Pathol, Tehran, Iran
[10] Shahid Beheshti Univ Med Sci, Sch Dent, Dept Oral & Maxillofacial Radiol, Tehran, Iran
Keywords
Artificial Intelligence; Deep Learning; Dentistry; Education, Dental
DOI
10.5624/isd.20240037
Chinese Library Classification (CLC)
R78 [Stomatology]
Discipline code
1003
Abstract
Purpose: Recent advancements in artificial intelligence (AI), particularly tools such as ChatGPT developed by OpenAI, a U.S.-based AI research organization, have transformed the healthcare and education sectors. This study investigated the effectiveness of ChatGPT in answering dentistry exam questions, demonstrating its potential to enhance professional practice and patient care.
Materials and Methods: This study assessed the performance of ChatGPT 3.5 and 4 on U.S. dental exams, specifically the Integrated National Board Dental Examination (INBDE), the Advanced Dental Admission Test (ADAT), and the Dental Admission Test (DAT); ChatGPT's answers were evaluated against the official answer sheets.
Results: ChatGPT 3.5 and 4 were tested with 253 questions from the INBDE, ADAT, and DAT exams. For the INBDE, both versions achieved 80% accuracy on knowledge-based questions and 66-69% on case history questions. On the ADAT, they scored 66-83% on knowledge-based and 76% on case history questions. ChatGPT 4 excelled on the DAT, with 94% accuracy on knowledge-based questions, 57% on mathematical analysis items, and 100% on comprehension questions, surpassing ChatGPT 3.5's rates of 83%, 31%, and 82%, respectively. The difference was significant for knowledge-based questions (P = 0.009). Both versions showed similar patterns in their incorrect responses.
Conclusion: Both ChatGPT 3.5 and 4 handled knowledge-based, case history, and comprehension questions effectively, with ChatGPT 4 proving more reliable and surpassing the performance of 3.5. ChatGPT 4's perfect score on comprehension questions underscores its trainability in specific subjects. However, both versions exhibited weaker performance on mathematical analysis, marking it as an area for improvement.
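The reported DAT gap (94% vs. 83% accuracy on knowledge-based questions, P = 0.009) implies a statistical comparison of the two models' correct/incorrect counts, but the abstract does not name the test used. Below is a minimal Python sketch assuming a chi-square test of independence; the item counts are illustrative placeholders rather than the study's data, so the printed P value will not reproduce the paper's 0.009.

# Sketch: compare two models' accuracy on the same question set with a
# chi-square test of independence. The paper does not state which test
# it used or the per-exam item counts; the numbers here are hypothetical.
from scipy.stats import chi2_contingency

# Rows: model; columns: correct, incorrect. Placeholder counts chosen to
# match the reported 94% and 83% accuracy out of a hypothetical 100 items.
table = [
    [94, 6],   # ChatGPT 4 (illustrative)
    [83, 17],  # ChatGPT 3.5 (illustrative)
]

chi2, p_value, dof, expected = chi2_contingency(table)
print(f"chi-square = {chi2:.2f}, P = {p_value:.3f}")

Because both models answered the same items, a paired test such as McNemar's would be an equally defensible reconstruction of the reported comparison.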
Pages: 271-275
Page count: 5