The performance of artificial intelligence language models in board-style dental knowledge assessment: A preliminary study on ChatGPT

Cited by: 17
Authors
Danesh, Arman [1,5]
Pazouki, Hirad [2 ]
Danesh, Kasra [3 ]
Danesh, Farzad
Danesh, Arsalan [4 ]
Affiliations
[1] Western Univ, Schulich Sch Med & Dent, London, ON, Canada
[2] Western Univ, Fac Hlth Sci, London, ON, Canada
[3] Florida Atlantic Univ, Coll Engn & Comp Sci, Boca Raton, FL 33431 USA
[4] Nova Southeastern Univ, Coll Dent Med, Dept Periodontol, Ft Lauderdale, FL 33314 USA
[5] Nova Southeastern Univ, Coll Dent Med, Dept Oral & Maxillofacial Surg, 3200 S Univ Dr, Davie, FL 33328 USA
Source
JOURNAL OF THE AMERICAN DENTAL ASSOCIATION, 2023, 154 (11)
Keywords
Artificial intelligence; ChatGPT; dental board examination; dental education; dentistry; Integrated National Board Dental Examination;
DOI
10.1016/j.adaj.2023.07.016
Chinese Library Classification
R78 [Stomatology];
Discipline Code
1003 ;
Abstract
Background. Although Chat Generative Pre-trained Transformer (ChatGPT) (OpenAI) may be an appealing educational resource for students, the chatbot's responses can be subject to misinformation. This study was designed to evaluate the performance of ChatGPT on a board-style multiple-choice dental knowledge assessment to gauge its capacity to output accurate dental content and, in turn, the risk of misinformation associated with use of the chatbot as an educational resource by dental students.
Methods. ChatGPT3.5 and ChatGPT4 were asked questions obtained from 3 different sources: INBDE Bootcamp, ITDOnline, and a list of board-style questions provided by the Joint Commission on National Dental Examinations. Image-based questions were excluded, as ChatGPT only takes text-based inputs. The mean performance across 3 trials was reported for each model.
Results. ChatGPT3.5 and ChatGPT4 answered 61.3% and 76.9% of the questions correctly on average, respectively. A 2-tailed t test was used to compare 2 independent sample means, and a 2-tailed χ2 test was used to compare 2 sample proportions. A P value less than .05 was considered to be statistically significant.
Conclusion. ChatGPT3.5 did not perform sufficiently well on the board-style knowledge assessment. ChatGPT4, however, displayed a competent ability to output accurate dental content. Future research should evaluate the proficiency of emerging models of ChatGPT in dentistry to assess its evolving role in dental education.
Practical Implications. Although ChatGPT showed an impressive ability to output accurate dental content, our findings should encourage dental students to incorporate ChatGPT as a supplement to their existing learning program instead of using it as their primary learning resource.
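The abstract names the two statistical comparisons (a 2-tailed t test on independent sample means and a 2-tailed χ2 test on sample proportions) but does not report the underlying trial scores or question counts. The following Python sketch (not the authors' code) illustrates how such comparisons could be run with scipy; the per-trial accuracies and the question count are hypothetical placeholders chosen only to match the reported means.

    # Minimal sketch of the two statistical comparisons described in the
    # abstract. All counts below are hypothetical: the record does not
    # report per-trial scores or the number of questions asked.
    from scipy import stats

    # Hypothetical per-trial accuracy (3 trials per model, per the study design)
    gpt35_trials = [0.60, 0.62, 0.62]  # mean ~61.3%
    gpt4_trials = [0.76, 0.77, 0.78]   # mean ~76.9%

    # 2-tailed t test comparing 2 independent sample means
    t_stat, p_means = stats.ttest_ind(gpt35_trials, gpt4_trials)

    # 2-tailed chi-square test comparing 2 sample proportions,
    # assuming a hypothetical pool of 150 scored questions per model
    n = 150
    correct = [round(0.613 * n), round(0.769 * n)]
    table = [[c, n - c] for c in correct]  # rows: model; cols: correct / incorrect
    chi2, p_props, dof, _ = stats.chi2_contingency(table)

    print(f"t test: t={t_stat:.2f}, P={p_means:.3f}")
    print(f"chi-square: chi2={chi2:.2f}, P={p_props:.3f}")
    # Per the abstract, P < .05 would be read as statistically significant.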
Pages: 970-974
Number of pages: 5
Related Articles
50 in total
  • [1] Artificial Intelligence for Anesthesiology Board-Style Examination Questions: Role of Large Language Models
    Khan, Adnan A.
    Yunus, Rayaan
    Sohail, Mahad
    Rehman, Taha A.
    Saeed, Shirin
    Bu, Yifan
    Jackson, Cullen D.
    Sharkey, Aidan
    Mahmood, Feroze
    Matyal, Robina
    JOURNAL OF CARDIOTHORACIC AND VASCULAR ANESTHESIA, 2024, 38 (05) : 1251 - 1259
  • [2] Evaluation of ChatGPT pathology knowledge using board-style questions
    Wiwanitkit, Somsri
    Wiwanitkit, Viroj
    JOURNAL OF STOMATOLOGY ORAL AND MAXILLOFACIAL SURGERY, 2024, 125 (05)
  • [3] Performance of Large Language Models on a Neurology Board-Style Examination
    Schubert, Marc Cicero
    Wick, Wolfgang
    Venkataramani, Varun
    JAMA NETWORK OPEN, 2023, 6 (12) : e2346721
  • [4] Evaluation of ChatGPT pathology knowledge using board-style questions
    Geetha, Saroja D.
    Khan, Anam
    Khan, Atif
    Kannadath, Bijun S.
    Vitkovski, Taisia
    AMERICAN JOURNAL OF CLINICAL PATHOLOGY, 2024, 161 (04) : 393 - 398
  • [5] Performance of Generative Large Language Models on Ophthalmology Board-Style Questions
    Cai, Louis Z.
    Shaheen, Abdulla
    Jin, Andrew
    Fukui, Riya
    Yi, Jonathan S.
    Yannuzzi, Nicolas
    Alabiad, Chrisfouad
    AMERICAN JOURNAL OF OPHTHALMOLOGY, 2023, 254 : 141 - 149
  • [6] ChatGPT versus the neurosurgical written boards: a comparative analysis of artificial intelligence/machine learning performance on neurosurgical board-style questions
    Hopkins, Benjamin S.
    Nguyen, Vincent N.
    Dallas, Jonathan
    Texakalidis, Pavlos
    Yang, Max
    Renn, Alex
    Guerra, Gage
    Kashif, Zain
    Cheok, Stephanie
    Zada, Gabriel
    Mack, William J.
    JOURNAL OF NEUROSURGERY, 2023, 139 (03) : 904 - 911
  • [7] Artificial Intelligence Showdown in Gastroenterology: A Comparative Analysis of Large Language Models (LLMs) in Tackling Board-Style Review Questions
    Shah, Kevin P.
    Dey, Shirin A.
    Pothula, Shravya
    Abud, Arnold
    Jain, Sukrit
    Srivastava, Aniruddha
    Dommaraju, Sagar
    Komanduri, Srinadh
    AMERICAN JOURNAL OF GASTROENTEROLOGY, 2024, 119 (10S) : S1567 - S1568
  • [8] Inadequate Performance of ChatGPT on Orthopedic Board-Style Written Exams
    Sparks, Chandler A.
    Kraeutler, Matthew J.
    Chester, Grace A.
    Contrada, Edward V.
    Zhu, Eric
    Fasulo, Sydney M.
    Scillia, Anthony J.
    CUREUS JOURNAL OF MEDICAL SCIENCE, 2024, 16 (06)
  • [9] Performance of ChatGPT on Solving Orthopedic Board-Style Questions: A Comparative Analysis of ChatGPT 3.5 and ChatGPT 4
    Kim, Sung Eun
    Lee, Ji Han
    Choi, Byung Sun
    Han, Hyuk-Soo
    Lee, Myung Chul
    Ro, Du Hyun
    CLINICS IN ORTHOPEDIC SURGERY, 2024, 16 (04) : 669 - 673
  • [10] Performance of ChatGPT on a Radiology Board-style Examination: Insights into Current Strengths and Limitations
    Bhayana, Rajesh
    Krishna, Satheesh
    Bleakney, Robert R.
    RADIOLOGY, 2023, 307 (05)