Evaluating the accuracy and relevance of ChatGPT responses to frequently asked questions regarding total knee replacement

Cited by: 8
Authors
Zhang, Siyuan [1 ]
Liau, Zi Qiang Glen [1 ]
Tan, Kian Loong Melvin [1 ]
Chua, Wei Liang [1 ]
Affiliations
[1] Natl Univ Hlth Syst, Dept Orthopaed Surg, Level 11,NUHS Tower Block,1E Kent Ridge Rd, Singapore 119228, Singapore
Keywords
ChatGPT; Artificial intelligence; Chatbot; Large language model; Total knee replacement; Total knee arthroplasty; ARTHROPLASTY;
DOI
10.1186/s43019-024-00218-5
Chinese Library Classification
R826.8 [Plastic Surgery]; R782.2 [Oral and Maxillofacial Plastic Surgery]; R726.2 [Pediatric Plastic Surgery]; R62 [Plastic Surgery (Reconstructive Surgery)];
Abstract
Background: Chat Generative Pretrained Transformer (ChatGPT), a generative artificial intelligence chatbot, may have broad applications in healthcare delivery and patient education due to its ability to provide human-like responses to a wide range of patient queries. However, there is limited evidence regarding its ability to provide reliable and useful information on orthopaedic procedures. This study evaluates the accuracy and relevance of responses provided by ChatGPT to frequently asked questions (FAQs) regarding total knee replacement (TKR).
Methods: A list of 50 clinically relevant FAQs regarding TKR was collated. Each question was individually entered as a prompt to ChatGPT (version 3.5), and the first response generated was recorded. Responses were then reviewed by two independent orthopaedic surgeons and graded on a Likert scale for their factual accuracy and relevance. Using preset thresholds on the Likert scale, responses were classified as accurate versus inaccurate and relevant versus irrelevant.
Results: Most responses were accurate, and all responses were relevant. Of the 50 FAQs, 44 (88%) of ChatGPT responses were classified as accurate, with a mean Likert grade of 4.6/5 for factual accuracy. All 50 responses (100%) were classified as relevant, with a mean Likert grade of 4.9/5 for relevance.
Conclusion: ChatGPT performed well in providing accurate and relevant responses to FAQs regarding TKR, demonstrating great potential as a tool for patient education. However, it is not infallible and can occasionally provide inaccurate medical information. Patients and clinicians intending to use this technology should be mindful of its limitations and ensure adequate supervision and verification of the information provided.
Pages: 8
Related Papers
50 records
  • [21] Assessing the accuracy and utility of ChatGPT responses to patient questions regarding posterior lumbar decompression
    Giakas, Alec M.
    Narayanan, Rajkishen
    Ezeonu, Teeto
    Dalton, Jonathan
    Lee, Yunsoo
    Henry, Tyler
    Mangan, John
    Schroeder, Gregory
    Vaccaro, Alexander
    Kepler, Christopher
    ARTIFICIAL INTELLIGENCE SURGERY, 2024, 4 (03): : 233 - 246
  • [22] ChatGPT Responses to Frequently Asked Questions on Meniere's Disease: A Comparison to Clinical Practice Guideline Answers
    Ho, Rebecca A.
    Shaari, Ariana L.
    Cowan, Paul T.
    Yan, Kenneth
    OTO OPEN, 2024, 8 (03)
  • [23] Evaluating ChatGPT as a patient resource for frequently asked questions about lung cancer surgery-a pilot study
    Ferrari-Light, Dana
    Merritt, Robert E.
    D'Souza, Desmond
    Ferguson, Mark K.
    Harrison, Sebron
    Madariaga, Maria Lucia
    Lee, Benjamin E.
    Moffatt-Bruce, Susan D.
    Kneuertz, Peter J.
    JOURNAL OF THORACIC AND CARDIOVASCULAR SURGERY, 2025, 169 (04)
  • [24] Appropriateness of Frequently Asked Patient Questions Following Total Hip Arthroplasty From ChatGPT Compared to Arthroplasty-Trained Nurses
    Dubin, Jeremy A.
    Bains, Sandeep S.
    DeRogatis, Michael J.
    Moore, Mallory C.
    Hameed, Daniel
    Mont, Michael A.
    Nace, James
    Delanois, Ronald E.
    JOURNAL OF ARTHROPLASTY, 2024, 39 (09) : S306 - S311
  • [25] Evaluating the accuracy of Chat Generative Pre-trained Transformer version 4 (ChatGPT-4) responses to United States Food and Drug Administration (FDA) frequently asked questions about dental amalgam
    Buldur, Mehmet
    Sezer, Berkant
    BMC ORAL HEALTH, 2024, 24 (01):
  • [26] Do ChatGPT and Google differ in answers to commonly asked patient questions regarding total shoulder and total elbow arthroplasty?
    Tharakan, Shebin
    Klein, Brandon
    Bartlett, Lucas
    Atlas, Aaron
    Parada, Stephen A.
    Cohn, Randy M.
    JOURNAL OF SHOULDER AND ELBOW SURGERY, 2024, 33 (08) : e429 - e437
  • [27] Enhancing responses from large language models with role-playing prompts: a comparative study on answering frequently asked questions about total knee arthroplasty
    Yi-Chen Chen
    Sheng-Hsun Lee
    Huan Sheu
    Sheng-Hsuan Lin
    Chih-Chien Hu
    Shih-Chen Fu
    Cheng-Pang Yang
    Yu-Chih Lin
    BMC MEDICAL INFORMATICS AND DECISION MAKING, 25 (1)
  • [28] Evaluating accuracy and reproducibility of ChatGPT responses to patient-based questions in Ophthalmology: An observational study
    Alqudah, Asem A.
    Aleshawi, Abdelwahab J.
    Baker, Mohammed
    Alnajjar, Zaina
    Ayasrah, Ibrahim
    Ta'ani, Yaqoot
    Al Salkhadi, Mohammad
    Aljawarneh, Shaima'a
    MEDICINE, 2024, 103 (32)
  • [29] Assessing ChatGPT Responses to Common Patient Questions Regarding Total Hip Arthroplasty
    Mika, Aleksander P.
    Martin, J. Ryan
    Engstrom, Stephen M.
    Polkowski, Gregory G.
    Wilson, Jacob M.
    JOURNAL OF BONE AND JOINT SURGERY-AMERICAN VOLUME, 2023, 105 (19) : 1519 - 1526
  • [30] ChatGPT provides acceptable responses to patient questions regarding common shoulder pathology
    Ghilzai, Umar
    Fiedler, Benjamin
    Ghali, Abdullah
    Singh, Aaron
    Cass, Benjamin
    Young, Allan
    Ahmed, Adil Shahzad
    SHOULDER & ELBOW, 2024,