Performance of ChatGPT 3.5 and 4 on U.S. dental examinations: the INBDE, ADAT, and DAT

Cited: 5
Authors
Dashti, Mahmood [1 ]
Ghasemi, Shohreh [2 ]
Ghadimi, Niloofar [3 ]
Hefzi, Delband [4 ]
Karimian, Azizeh [5 ]
Zare, Niusha [6 ]
Fahimipour, Amir [7 ]
Khurshid, Zohaib [8 ]
Chafjiri, Maryam Mohammadalizadeh [9 ]
Ghaedsharaf, Sahar [10 ]
Affiliations
[1] Shahid Beheshti Univ Med Sci, Res Inst Dent Sci, Dentofacial Deform Res Ctr, Tehran, Iran
[2] Queen Mary Coll, Dept Trauma & Craniofacial Reconstruct, London, England
[3] Islamic Azad Univ Med Sci, Dent Sch, Dept Oral & Maxillofacial Radiol, Tehran, Iran
[4] Univ Tehran Med Sci, Sch Dent, Tehran, Iran
[5] Golestan Univ Med Sci, Dent Res Ctr, Dept Biostat, Gorgan, Iran
[6] Univ Southern Calif, Dept Operat Dent, Los Angeles, CA USA
[7] Univ Sydney, Discipline Oral Surg Med & Diagnost, Sch Dent, Fac Med & Hlth,Westmead Ctr Oral Hlth, Sydney, Australia
[8] King Faisal Univ, Dept Prosthodont & Dent Implantol, Al Hasa, Saudi Arabia
[9] Shahid Beheshti Univ Med Sci, Sch Dent, Dept Oral & Maxillofacial Pathol, Tehran, Iran
[10] Shahid Beheshti Univ Med Sci, Sch Dent, Dept Oral & Maxillofacial Radiol, Tehran, Iran
Keywords
Artificial Intelligence; Deep Learning; Dentistry; Education, Dental
DOI
10.5624/isd.20240037
Chinese Library Classification
R78 [Stomatology]
Subject Classification Code
1003
Abstract
Purpose: Recent advancements in artificial intelligence (AI), particularly tools such as ChatGPT developed by OpenAI, a U.S.-based AI research organization, have transformed the healthcare and education sectors. This study investigated the effectiveness of ChatGPT in answering dentistry exam questions, demonstrating its potential to enhance professional practice and patient care. Materials and Methods: This study assessed the performance of ChatGPT 3.5 and 4 on U.S. dental exams, specifically the Integrated National Board Dental Examination (INBDE), the Advanced Dental Admission Test (ADAT), and the Dental Admission Test (DAT); ChatGPT's answers were evaluated against the official answer sheets. Results: ChatGPT 3.5 and 4 were tested with 253 questions from the INBDE, ADAT, and DAT exams. For the INBDE, both versions achieved 80% accuracy on knowledge-based questions and 66-69% on case history questions. On the ADAT, they scored 66-83% on knowledge-based and 76% on case history questions. ChatGPT 4 excelled on the DAT, with 94% accuracy on knowledge-based questions, 57% on mathematical analysis items, and 100% on comprehension questions, surpassing ChatGPT 3.5's rates of 83%, 31%, and 82%, respectively. The difference was significant for knowledge-based questions (P = 0.009). Both versions showed similar patterns in their incorrect responses. Conclusion: Both ChatGPT 3.5 and 4 handled knowledge-based, case history, and comprehension questions effectively, with ChatGPT 4 proving more reliable and outperforming 3.5. ChatGPT 4's perfect score on comprehension questions underscores its trainability in specific subjects. However, both versions performed weaker on mathematical analysis, suggesting this as an area for improvement.
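The abstract does not specify the prompt format, the interface used to query the models, or the statistical test behind P = 0.009. The following is a minimal sketch of the kind of workflow it describes, assuming the OpenAI Python SDK's chat-completions API, a single-letter answer format, and a Fisher's exact test on correct/incorrect counts; ask_model, score, the model names, and the counts are illustrative placeholders, not the study's materials or data.

```python
# Sketch: pose multiple-choice items to two model versions, grade against an
# official answer key, and compare the two accuracy rates. All names and
# numbers here are assumptions for illustration, not the paper's pipeline.
from openai import OpenAI
from scipy.stats import fisher_exact

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def ask_model(model: str, question: str, choices: list[str]) -> str:
    """Pose one multiple-choice item and return the model's single-letter answer."""
    options = "\n".join(f"{chr(65 + i)}. {c}" for i, c in enumerate(choices))
    response = client.chat.completions.create(
        model=model,
        messages=[{
            "role": "user",
            "content": f"{question}\n{options}\n"
                       "Answer with the single letter of the best option.",
        }],
        temperature=0,  # keep answers as deterministic as possible for scoring
    )
    return response.choices[0].message.content.strip()[0].upper()


def score(model: str, items: list[dict]) -> int:
    """Count answers that match the official answer key ('key' field)."""
    return sum(
        ask_model(model, q["question"], q["choices"]) == q["key"]
        for q in items
    )


# Compare two models' correct/incorrect counts on one question category with
# Fisher's exact test (placeholder counts, not the study's data).
n_items = 35
correct_35, correct_4 = 29, 33
table = [[correct_35, n_items - correct_35],
         [correct_4, n_items - correct_4]]
_, p_value = fisher_exact(table)
print(f"P = {p_value:.3f}")
```

A per-category comparison of this kind could produce significance figures like the P = 0.009 reported above, though the paper does not state which test it used.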
Pages: 271-275
Number of pages: 5