Performance of artificial intelligence chatbots in responding to the frequently asked questions of patients regarding dental prostheses

Cited by: 0
Authors
Hossein Esmailpour [1 ]
Vanya Rasaie [2 ]
Yasamin Babaee Hemmati [3 ]
Mehran Falahchai [4 ]
Affiliations
[1] School of Dentistry, Guilan University of Medical Sciences, Rasht
[2] Research Affiliate at Sydney Dental School, Faculty of Medicine and Health, Sydney
[3] Department of Orthodontics, Dental Sciences Research Center, School of Dentistry, Guilan University of Medical Sciences, Rasht
[4] Department of Prosthodontics, Dental Sciences Research Center, School of Dentistry, Guilan University of Medical Sciences, Rasht
Keywords
Artificial intelligence; Health literacy; Natural language processing; Patient education as topic; Prosthodontics
DOI
10.1186/s12903-025-05965-9
Abstract
Background: Artificial intelligence (AI) chatbots are increasingly used in healthcare to address patient questions by providing personalized responses. Evaluating their performance is essential to ensure their reliability. This study aimed to assess the performance of three AI chatbots in responding to the frequently asked questions (FAQs) of patients regarding dental prostheses.

Methods: Thirty-one FAQs were collected from accredited organizations' websites and the "People Also Ask" feature of Google, focusing on removable and fixed prosthodontics. Two board-certified prosthodontists evaluated response quality using the modified Global Quality Score (GQS) on a 5-point Likert scale. Inter-examiner agreement was assessed using weighted kappa. Readability was measured using the Flesch-Kincaid Grade Level (FKGL) and Flesch Reading Ease (FRE) indices. Statistical analyses were performed using repeated measures ANOVA and the Friedman test, with Bonferroni correction for pairwise comparisons (α = 0.05).

Results: Inter-examiner agreement was good. Among the chatbots, Google Gemini had the highest quality score (4.58 ± 0.50), significantly outperforming Microsoft Copilot (3.87 ± 0.89) (P = .004). Readability analysis showed that ChatGPT (FKGL 10.45 ± 1.26) produced significantly more complex responses than Gemini (7.82 ± 1.19) and Copilot (8.38 ± 1.59) (P < .001). FRE scores indicated that ChatGPT's responses were "fairly difficult" (53.05 ± 7.16), while Gemini's were "plain English" (64.94 ± 7.29), with a significant difference between them (P < .001).

Conclusions: AI chatbots show great potential in answering patient inquiries about dental prostheses. However, improvements are needed to enhance their effectiveness as patient education tools. © The Author(s) 2025.
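The two readability indices used in the study are fixed linear formulas over average sentence length and average syllables per word: FRE = 206.835 − 1.015 × (words/sentences) − 84.6 × (syllables/words), and FKGL = 0.39 × (words/sentences) + 11.8 × (syllables/words) − 15.59. The sketch below shows how such scores can be computed; the vowel-group syllable counter is a common heuristic approximation (not the tool the authors used), so exact values may differ slightly from dedicated readability software.

```python
import re

def count_syllables(word: str) -> int:
    # Heuristic: count contiguous vowel groups; drop a trailing silent "e";
    # every word contributes at least one syllable.
    groups = re.findall(r"[aeiouy]+", word.lower())
    n = len(groups)
    if word.lower().endswith("e") and n > 1:
        n -= 1
    return max(n, 1)

def readability(text: str) -> tuple[float, float]:
    """Return (FRE, FKGL) using the standard Flesch formulas."""
    sentences = max(len(re.findall(r"[.!?]+", text)), 1)
    words = re.findall(r"[A-Za-z]+", text)
    syllables = sum(count_syllables(w) for w in words)
    wps = len(words) / sentences   # average words per sentence
    spw = syllables / len(words)   # average syllables per word
    fre = 206.835 - 1.015 * wps - 84.6 * spw
    fkgl = 0.39 * wps + 11.8 * spw - 15.59
    return round(fre, 2), round(fkgl, 2)
```

On this scale, higher FRE means easier text (60-70 is "plain English", 50-60 "fairly difficult"), while FKGL approximates the US school grade needed to understand the passage, which is how the study compares chatbot outputs.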