The performance of artificial intelligence language models in board-style dental knowledge assessment: A preliminary study on ChatGPT

Cited by: 18
Authors
Danesh, Arman [1 ,5 ]
Pazouki, Hirad [2 ]
Danesh, Kasra [3 ]
Danesh, Farzad
Danesh, Arsalan [4 ]
Affiliations
[1] Western Univ, Schulich Sch Med & Dent, London, ON, Canada
[2] Western Univ, Fac Hlth Sci, London, ON, Canada
[3] Florida Atlantic Univ, Coll Engn & Comp Sci, Boca Raton, FL 33431 USA
[4] Nova Southeastern Univ, Coll Dent Med, Dept Periodontol, Ft Lauderdale, FL 33314 USA
[5] Nova Southeastern Univ, Coll Dent Med, Dept Oral & Maxillofacial Surg, 3200 S Univ Dr, Davie, FL 33328 USA
Keywords
Artificial intelligence; ChatGPT; dental board examination; dental education; dentistry; Integrated National Board Dental Examination
DOI
10.1016/j.adaj.2023.07.016
Chinese Library Classification
R78 [Stomatology]
Subject classification code
1003
Abstract
Background. Although Chat Generative Pre-trained Transformer (ChatGPT) (OpenAI) may be an appealing educational resource for students, the chatbot responses can be subject to misinformation. This study was designed to evaluate the performance of ChatGPT on a board-style multiple-choice dental knowledge assessment to gauge its capacity to output accurate dental content and, in turn, the risk of misinformation associated with use of the chatbot as an educational resource by dental students.
Methods. ChatGPT3.5 and ChatGPT4 were asked questions obtained from 3 different sources: INBDE Bootcamp, ITDOnline, and a list of board-style questions provided by the Joint Commission on National Dental Examinations. Image-based questions were excluded, as ChatGPT only takes text-based inputs. The mean performance across 3 trials was reported for each model.
Results. ChatGPT3.5 and ChatGPT4 answered 61.3% and 76.9% of the questions correctly on average, respectively. A 2-tailed t test was used to compare 2 independent sample means, and a 2-tailed χ2 test was used to compare 2 sample proportions. A P value less than .05 was considered to be statistically significant.
Conclusion. ChatGPT3.5 did not perform sufficiently well on the board-style knowledge assessment. ChatGPT4, however, displayed a competent ability to output accurate dental content. Future research should evaluate the proficiency of emerging models of ChatGPT in dentistry to assess its evolving role in dental education.
Practical Implications. Although ChatGPT showed an impressive ability to output accurate dental content, our findings should encourage dental students to incorporate ChatGPT to supplement their existing learning program instead of using it as their primary learning resource.
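The Results paragraph describes a 2-tailed χ2 test comparing two sample proportions (61.3% vs. 76.9% of questions answered correctly). A minimal sketch of that comparison, using only the standard library; note the question count `n = 160` is hypothetical, since the abstract reports percentages only:

```python
import math

def two_proportion_chi2(k1, n1, k2, n2):
    """2-tailed chi-square test for two sample proportions (df = 1).

    k1/n1 and k2/n2 are correct-answer counts over total questions.
    Returns the chi-square statistic and its P value.
    """
    p_pool = (k1 + k2) / (n1 + n2)  # pooled proportion under H0
    observed = [k1, n1 - k1, k2, n2 - k2]
    expected = [n1 * p_pool, n1 * (1 - p_pool),
                n2 * p_pool, n2 * (1 - p_pool)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical pool of 160 questions per model (illustrative only):
n = 160
chi2, p = two_proportion_chi2(round(0.613 * n), n, round(0.769 * n), n)
```

With these illustrative counts the difference clears the P < .05 threshold; the actual significance reported in the paper depends on the study's true question counts.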
Pages: 970-974 (5 pages)
Related articles (50 in total; items 21-30 shown)
  • [21] Performance of artificial intelligence chatbots in sleep medicine certification board exams: ChatGPT versus Google Bard
    Cheong, Ryan Chin Taw; Pang, Kenny Peter; Unadkat, Samit; Mcneillis, Venkata; Williamson, Andrew; Joseph, Jonathan; Randhawa, Premjit; Andrews, Peter; Paleri, Vinidh
    EUROPEAN ARCHIVES OF OTO-RHINO-LARYNGOLOGY, 2024, 281 (04): 2137-2143
  • [22] Generative Artificial Intelligence Through ChatGPT and Other Large Language Models in Ophthalmology: Clinical Applications and Challenges
    Tan, Ting Fang; Thirunavukarasu, Arun James; Campbell, J. Peter; Keane, Pearse A.; Pasquale, Louis R.; Abramoff, Michael D.; Kalpathy-Cramer, Jayashree; Lum, Flora; Kim, Judy E.; Baxter, Sally L.; Ting, Daniel Shu Wei
    OPHTHALMOLOGY SCIENCE, 2023, 3 (04)
  • [23] Artificial intelligence chatbots and large language models in dental education: Worldwide survey of educators
    Uribe, Sergio E.; Maldupa, Ilze; Kavadella, Argyro; El Tantawi, Maha; Chaurasia, Akhilanand; Fontana, Margherita; Marino, Rodrigo; Innes, Nicola; Schwendicke, Falk
    EUROPEAN JOURNAL OF DENTAL EDUCATION, 2024, 28 (04): 865-876
  • [24] Artificial intelligence and social intelligence: preliminary comparison study between AI models and psychologists
    Sufyan, Nabil Saleh; Fadhel, Fahmi H.; Alkhathami, Saleh Safeer; Mukhadi, Jubran Y. A.
    FRONTIERS IN PSYCHOLOGY, 2024, 15
  • [25] Assessment of knowledge and awareness of artificial intelligence and its uses in dentistry among dental students
    Vamshi Ram, V.; Sadeep, Hima
    JOURNAL OF PHARMACEUTICAL NEGATIVE RESULTS, 2022, 13: 1304-1309
  • [26] Performance of artificial intelligence on Turkish dental specialization exam: can ChatGPT-4.0 and Gemini Advanced achieve comparable results to humans?
    Sismanoglu, Soner; Capan, Belen Sirinoglu
    BMC MEDICAL EDUCATION, 2025, 25 (01)
  • [27] Making sense of artificial intelligence and large language models, including ChatGPT, in pediatric hematology/oncology
    Wyatt, Kirk D.; Alexander, Natasha; Hills, Gerard D.; Liang, Wayne H.; Kadauke, Stephan; Volchenboum, Samuel L.; Mian, Amir; Phillips, Charles A.
    PEDIATRIC BLOOD & CANCER, 2024, 71 (09)
  • [28] Best practices for implementing ChatGPT, large language models, and artificial intelligence in qualitative and survey-based research
    Kantor, Jonathan
    JAAD INTERNATIONAL, 2024, 14: 22-23
  • [29] Assessment of Artificial Intelligence Platforms With Regard to Medical Microbiology Knowledge: An Analysis of ChatGPT and Gemini
    Ranjan, Jai; Ahmad, Absar; Subudhi, Monalisa; Kumar, Ajay
    CUREUS JOURNAL OF MEDICAL SCIENCE, 2024, 16 (05)
  • [30] Performance of the Large Language Models in African rheumatology: a diagnostic test accuracy study of ChatGPT-4, Gemini, Copilot, and Claude artificial intelligence
    Yannick Laurent Tchenadoyo Bayala; Wendlassida Joelle Stéphanie Zabsonré/Tiendrebeogo; Dieu-Donné Ouedraogo; Fulgence Kaboré; Charles Sougué; Aristide Relwendé Yameogo; Wendlassida Martin Nacanabo; Ismael Ayouba Tinni; Aboubakar Ouedraogo; Yamyellé Enselme Zongo
    BMC RHEUMATOLOGY, 9 (1)