Evaluating the accuracy and relevance of ChatGPT responses to frequently asked questions regarding total knee replacement

Cited by: 8
Authors
Zhang, Siyuan [1]
Liau, Zi Qiang Glen [1]
Tan, Kian Loong Melvin [1]
Chua, Wei Liang [1]
Affiliations
[1] Natl Univ Hlth Syst, Dept Orthopaed Surg, Level 11,NUHS Tower Block,1E Kent Ridge Rd, Singapore 119228, Singapore
Keywords
ChatGPT; Artificial intelligence; Chatbot; Large language model; Total knee replacement; Total knee arthroplasty; ARTHROPLASTY
DOI
10.1186/s43019-024-00218-5
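Chinese Library Classification (CLC) number
R826.8 [Plastic surgery]; R782.2 [Oral and maxillofacial plastic surgery]; R726.2 [Pediatric plastic surgery]; R62 [Plastic surgery (reconstructive surgery)];
Discipline classification code
Abstract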
Background: Chat Generative Pretrained Transformer (ChatGPT), a generative artificial intelligence chatbot, may have broad applications in healthcare delivery and patient education because it can provide human-like responses to a wide range of patient queries. However, there is limited evidence regarding its ability to provide reliable and useful information on orthopaedic procedures. This study evaluates the accuracy and relevance of ChatGPT responses to frequently asked questions (FAQs) regarding total knee replacement (TKR).
Methods: A list of 50 clinically relevant FAQs regarding TKR was collated. Each question was entered individually as a prompt to ChatGPT (version 3.5), and the first response generated was recorded. Two independent orthopaedic surgeons then reviewed the responses and graded them on a Likert scale for factual accuracy and relevance. Using preset thresholds on the Likert scale, the responses were classified as accurate versus inaccurate and as relevant versus irrelevant.
Results: Most responses were accurate, and all responses were relevant. Of the 50 responses, 44 (88%) were classified as accurate, with a mean Likert grade of 4.6/5 for factual accuracy. All 50 (100%) were classified as relevant, with a mean Likert grade of 4.9/5 for relevance.
Conclusion: ChatGPT performed well in providing accurate and relevant responses to FAQs regarding TKR, demonstrating great potential as a tool for patient education. However, it is not infallible and can occasionally provide inaccurate medical information. Patients and clinicians intending to use this technology should be mindful of its limitations and ensure that the information provided is adequately supervised and verified.
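The grading-and-classification step described in the Methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not report the actual threshold values or how the two surgeons' grades were combined, so the cut-off of 4 and the averaging of the two raters below are hypothetical, and the toy grades are invented for demonstration, not study data.

```python
from statistics import mean

# Hypothetical cut-off: the abstract mentions "preset thresholds"
# on the Likert scale but does not state their values.
THRESHOLD = 4.0


def classify(grades_by_rater, threshold=THRESHOLD):
    """Average the two raters' Likert grades (1-5) for each response
    and flag whether each averaged grade meets the threshold.

    Averaging the raters is an assumption made for this sketch.
    """
    per_response = [mean(pair) for pair in grades_by_rater]
    flags = [grade >= threshold for grade in per_response]
    return per_response, flags


def summarize(grades_by_rater, threshold=THRESHOLD):
    """Return the count, percentage above threshold, and mean grade."""
    per_response, flags = classify(grades_by_rater, threshold)
    n = len(flags)
    n_positive = sum(flags)
    return {
        "n": n,
        "n_positive": n_positive,
        "percent_positive": 100 * n_positive / n,
        "mean_grade": mean(per_response),
    }


# Invented toy grades (rater 1, rater 2) for five responses.
toy_grades = [(5, 5), (4, 5), (3, 3), (5, 4), (2, 3)]
summary = summarize(toy_grades)

# The reported accuracy proportion is plain arithmetic on the counts
# given in the abstract: 44 of 50 responses.
reported_accuracy_percent = 100 * 44 / 50  # 88.0
```

The same `summarize` function could be applied separately to the accuracy grades and the relevance grades, each with its own threshold.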
Pages: 8
Related papers
50 in total
  • [31] Dr. Google vs. Dr. ChatGPT: Exploring the Use of Artificial Intelligence in Ophthalmology by Comparing the Accuracy, Safety, and Readability of Responses to Frequently Asked Patient Questions Regarding Cataracts and Cataract Surgery
    Cohen, Samuel A.
    Brant, Arthur
    Fisher, Ann Caroline
    Pershing, Suzann
    Do, Diana
    Pan, Carolyn
    SEMINARS IN OPHTHALMOLOGY, 2024, 39 (06) : 472 - 479
  • [32] Reliability of artificial intelligence chatbot responses to frequently asked questions in breast surgical oncology
    Roldan-Vasquez, Estefania
    Mitri, Samir
    Bhasin, Shreya
    Bharani, Tina
    Capasso, Kathryn
    Haslinger, Michelle
    Sharma, Ranjna
    James, Ted A.
    JOURNAL OF SURGICAL ONCOLOGY, 2024, 130 (02) : 188 - 203
  • [33] How accurately can ChatGPT 3.5 answer frequently asked questions by patients on glenohumeral osteoarthritis?
    Youssef, Yasmin
    Youssef, Salim
    Melcher, Peter
    Henkelmann, Ralf
    Osterhoff, Georg
    Theopold, Jan
    OBERE EXTREMITAET-SCHULTER-ELLENBOGEN-HAND-UPPER EXTREMITY-SHOULDER ELBOW HAND, 2024,
  • [34] An Assessment of ChatGPT's Responses to Common Patient Questions About Lung Cancer Surgery: A Preliminary Clinical Evaluation of Accuracy and Relevance
    Troian, Marina
    Lovadina, Stefano
    Ravasin, Alice
    Arbore, Alessia
    Aleksova, Aneta
    Baratella, Elisa
    Cortale, Maurizio
    JOURNAL OF CLINICAL MEDICINE, 2025, 14 (05)
  • [35] Performance of artificial intelligence chatbots in responding to the frequently asked questions of patients regarding dental prostheses
    Hossein Esmailpour
    Vanya Rasaie
    Yasamin Babaee Hemmati
    Mehran Falahchai
    BMC Oral Health, 25 (1)
  • [36] Evaluating the comprehension and accuracy of ChatGPT's responses to diabetes-related questions in Urdu compared to English
    Faisal, Seyreen
    Kamran, Tafiya Erum
    Khalid, Rimsha
    Haider, Zaira
    Siddiqui, Yusra
    Saeed, Nadia
    Imran, Sunaina
    Faisal, Romaan
    Jabeen, Misbah
    DIGITAL HEALTH, 2024, 10
  • [37] Patient perceptions regarding minimally invasive total knee replacement
    Kundra, R. K.
    Chowdhry, M.
    Fisher, N.
    Shrestha, B.
    Mathur, K.
    EUROPEAN JOURNAL OF ORTHOPAEDIC SURGERY AND TRAUMATOLOGY, 2009, 19 (03) : 173 - 176
  • [38] ACCURACY OF INTRAMEDULLARY ALIGNMENT IN TOTAL KNEE REPLACEMENT
    ELLOY, MA
    MANNING, MP
    JOHNSON, R
    JOURNAL OF BIOMEDICAL ENGINEERING, 1992, 14 (05) : 363 - 370
  • [39] CAN ARTIFICIAL INTELLIGENCE EFFECTIVELY RESPOND TO FREQUENTLY ASKED QUESTIONS ABOUT FLUORIDE USAGE AND EFFECTS? A QUALITATIVE STUDY ON CHATGPT
    Buldur, Mehmet
    Sezer, Berkant
    FLUORIDE, 2023, 56 (03) : 201 - 216
  • [40] Evaluating ChatGPT's Accuracy in Responding to Patient Education Questions on Acute Kidney Injury and Continuous Renal Replacement Therapy
    Sheikh, Mohammad Salman
    Thongprayoon, Charat
    Suppadungsuk, Supawadee
    Miao, Jing
    Qureshi, Fawad
    Kashani, Kianoush
    Cheungpasitporn, Wisit
    BLOOD PURIFICATION, 2024, 53 (09) : 725 - 731