Performance of large language models in oral and maxillofacial surgery examinations

Cited by: 2
Authors
Quah, B. [1, 2]
Yong, C. W. [1, 2]
Lai, C. W. M. [1]
Islam, I. [1, 2]
Affiliations
[1] Natl Univ Singapore, Fac Dent, 9 Lower Kent Ridge Rd, Singapore 119085, Singapore
[2] Natl Univ Ctr Oral Hlth, Discipline Oral & Maxillofacial Surg, Singapore, Singapore
Keywords
Artificial intelligence; Oral surgery; Dental education; Academic performance; Dentistry
DOI
10.1016/j.ijom.2024.06.003
Chinese Library Classification number
R78 [Stomatology]
Subject classification code
1003
Abstract
This study aimed to determine the accuracy of large language models (LLMs) in answering oral and maxillofacial surgery (OMS) multiple choice questions. A total of 259 questions from the university's question bank were answered by the LLMs (GPT-3.5, GPT-4, Llama 2, Gemini, and Copilot). The scores per category as well as the total score out of 259 were recorded and evaluated, with the passing score set at 50%. The mean overall score amongst all LLMs was 62.5%. GPT-4 performed the best (76.8%, 95% confidence interval (CI) 71.4-82.2%), followed by Copilot (72.6%, 95% CI 67.2-78.0%), GPT-3.5 (62.2%, 95% CI 56.4-68.0%), Gemini (58.7%, 95% CI 52.9-64.5%), and Llama 2 (42.5%, 95% CI 37.1-48.6%). There was a statistically significant difference between the scores of the five LLMs overall (χ² = 79.9, df = 4, P < 0.001) and within all categories except 'basic sciences' (P = 0.129), 'dentoalveolar and implant surgery' (P = 0.052), and 'oral medicine/pathology/radiology' (P = 0.801). The LLMs performed best in 'basic sciences' (68.9%) and poorest in 'pharmacology' (45.9%). The LLMs can be used as adjuncts in teaching, but should not be used for clinical decision-making until the models are further developed and validated.
Pages: 881-886
Number of pages: 6
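
The overall figures in the abstract can be roughly checked with a short calculation. The sketch below is not the authors' analysis: it assumes the number of correct answers per model can be back-calculated by rounding each reported percentage of 259, and it assumes the overall comparison was a Pearson chi-square test on a 5 x 2 table of correct versus incorrect answers. The names `inferred_correct` and `chi_square_correct_vs_incorrect` are hypothetical helpers introduced for illustration.

```python
# Rough consistency check of the abstract's overall statistics.
# ASSUMPTION: raw counts are not given in the abstract; they are inferred
# here by rounding each reported percentage of the 259 questions.
TOTAL_QUESTIONS = 259

reported_pct = {
    "GPT-4": 76.8,
    "Copilot": 72.6,
    "GPT-3.5": 62.2,
    "Gemini": 58.7,
    "Llama 2": 42.5,
}


def inferred_correct(pct, n=TOTAL_QUESTIONS):
    """Back-calculate a count of correct answers from a percentage (assumption)."""
    return round(pct / 100 * n)


def chi_square_correct_vs_incorrect(correct_counts, n):
    """Pearson chi-square for a k x 2 table in which every row has n questions.

    Because all rows have the same total, the expected number of correct
    answers in each row is simply the mean of the observed correct counts.
    """
    k = len(correct_counts)
    expected_correct = sum(correct_counts) / k
    expected_incorrect = n - expected_correct
    stat = 0.0
    for c in correct_counts:
        stat += (c - expected_correct) ** 2 / expected_correct
        stat += ((n - c) - expected_incorrect) ** 2 / expected_incorrect
    return stat, k - 1  # degrees of freedom: (k - 1) * (2 - 1)


correct = {model: inferred_correct(p) for model, p in reported_pct.items()}
chi2, df = chi_square_correct_vs_incorrect(list(correct.values()), TOTAL_QUESTIONS)
mean_pct = 100 * sum(correct.values()) / (len(correct) * TOTAL_QUESTIONS)

print(correct)
print(f"chi-square = {chi2:.2f}, df = {df}, mean score = {mean_pct:.2f}%")
```

Under these assumptions the statistic comes out close to the reported 79.9 with df = 4, and the pooled mean close to the reported 62.5%. The confidence intervals are not reproduced here because the abstract does not state which interval method was used.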