Performance of large language models in oral and maxillofacial surgery examinations

Cited by: 2
Authors
Quah, B. [1, 2]
Yong, C. W. [1, 2]
Lai, C. W. M. [1]
Islam, I. [1, 2]
Affiliations
[1] Natl Univ Singapore, Fac Dent, 9 Lower Kent Ridge Rd, Singapore 119085, Singapore
[2] Natl Univ Ctr Oral Hlth, Discipline Oral & Maxillofacial Surg, Singapore, Singapore
Keywords
Artificial intelligence; Oral surgery; Dental education; Academic performance; Dentistry
DOI
10.1016/j.ijom.2024.06.003
Chinese Library Classification (CLC) number
R78 [Stomatology]
Discipline classification code
1003
Abstract
This study aimed to determine the accuracy of large language models (LLMs) in answering oral and maxillofacial surgery (OMS) multiple choice questions. A total of 259 questions from the university's question bank were answered by the LLMs (GPT-3.5, GPT-4, Llama 2, Gemini, and Copilot). The scores per category as well as the total score out of 259 were recorded and evaluated, with the passing score set at 50%. The mean overall score amongst all LLMs was 62.5%. GPT-4 performed the best (76.8%, 95% confidence interval (CI) 71.4-82.2%), followed by Copilot (72.6%, 95% CI 67.2-78.0%), GPT-3.5 (62.2%, 95% CI 56.4-68.0%), Gemini (58.7%, 95% CI 52.9-64.5%), and Llama 2 (42.5%, 95% CI 37.1-48.6%). There was a statistically significant difference between the scores of the five LLMs overall (χ² = 79.9, df = 4, P < 0.001) and within all categories except 'basic sciences' (P = 0.129), 'dentoalveolar and implant surgery' (P = 0.052), and 'oral medicine/pathology/radiology' (P = 0.801). The LLMs performed best in 'basic sciences' (68.9%) and poorest in 'pharmacology' (45.9%). The LLMs can be used as adjuncts in teaching, but should not be used for clinical decision-making until the models are further developed and validated.
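The overall comparison reported in the abstract can be illustrated with a short calculation. The sketch below is not the authors' analysis code. It assumes the per-model correct-answer counts can be back-calculated by rounding each reported percentage of 259 questions (the abstract gives no raw counts), and it assumes the overall test was a Pearson chi-square test on the resulting 5 x 2 table of correct versus incorrect answers. The names `correct` and `chi_square_homogeneity` are hypothetical.

```python
# Assumed correct-answer counts, back-calculated from the reported
# percentages of 259 questions. These are NOT given in the abstract.
TOTAL_QUESTIONS = 259
correct = {
    "GPT-4": 199,    # 76.8%
    "Copilot": 188,  # 72.6%
    "GPT-3.5": 161,  # 62.2%
    "Gemini": 152,   # 58.7%
    "Llama 2": 110,  # 42.5%
}


def chi_square_homogeneity(correct_counts, n_per_group):
    """Pearson chi-square statistic for a k x 2 (correct / incorrect) table
    in which every group answered the same number of questions."""
    groups = len(correct_counts)
    pooled_rate = sum(correct_counts) / (groups * n_per_group)
    expected_correct = pooled_rate * n_per_group
    expected_wrong = n_per_group - expected_correct
    statistic = 0.0
    for observed_correct in correct_counts:
        observed_wrong = n_per_group - observed_correct
        statistic += (observed_correct - expected_correct) ** 2 / expected_correct
        statistic += (observed_wrong - expected_wrong) ** 2 / expected_wrong
    degrees_of_freedom = groups - 1  # (rows - 1) * (columns - 1), 2 columns
    return statistic, degrees_of_freedom


stat, df = chi_square_homogeneity(list(correct.values()), TOTAL_QUESTIONS)
mean_score = 100 * sum(correct.values()) / (len(correct) * TOTAL_QUESTIONS)

for model, count in correct.items():
    print(f"{model}: {100 * count / TOTAL_QUESTIONS:.1f}%")
print(f"mean overall score: {mean_score:.1f}%")
print(f"chi-square = {stat:.1f}, df = {df}")
```

Under these assumed counts the statistic comes to about 79.9 with df = 4 and the mean score to about 62.5%, in line with the figures in the abstract. The confidence intervals are not recomputed here because the abstract does not state which interval method was used.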
Pages: 881-886
Number of pages: 6