Evaluating the accuracy and relevance of ChatGPT responses to frequently asked questions regarding total knee replacement

Cited by: 8
Authors
Zhang, Siyuan [1]
Liau, Zi Qiang Glen [1]
Tan, Kian Loong Melvin [1]
Chua, Wei Liang [1]
Affiliations
[1] Natl Univ Hlth Syst, Dept Orthopaed Surg, Level 11,NUHS Tower Block,1E Kent Ridge Rd, Singapore 119228, Singapore
Keywords
ChatGPT; Artificial intelligence; Chatbot; Large language model; Total knee replacement; Total knee arthroplasty; Arthroplasty
DOI
10.1186/s43019-024-00218-5
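Chinese Library Classification (CLC) number
R826.8 [Plastic surgery]; R782.2 [Oral and maxillofacial plastic surgery]; R726.2 [Pediatric plastic surgery]; R62 [Plastic surgery (reconstructive surgery)]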
Abstract
Background: Chat Generative Pretrained Transformer (ChatGPT), a generative artificial intelligence chatbot, may have broad applications in healthcare delivery and patient education due to its ability to provide human-like responses to a wide range of patient queries. However, there is limited evidence regarding its ability to provide reliable and useful information on orthopaedic procedures. This study seeks to evaluate the accuracy and relevance of responses provided by ChatGPT to frequently asked questions (FAQs) regarding total knee replacement (TKR).
Methods: A list of 50 clinically relevant FAQs regarding TKR was collated. Each question was individually entered as a prompt to ChatGPT (version 3.5), and the first response generated was recorded. Responses were then reviewed by two independent orthopaedic surgeons and graded on a Likert scale for their factual accuracy and relevance. These responses were then classified into accurate versus inaccurate and relevant versus irrelevant responses using preset thresholds on the Likert scale.
Results: Most responses were accurate, while all responses were relevant. Of the 50 FAQs, 44/50 (88%) of ChatGPT responses were classified as accurate, achieving a mean Likert grade of 4.6/5 for factual accuracy. On the other hand, 50/50 (100%) of responses were classified as relevant, achieving a mean Likert grade of 4.9/5 for relevance.
Conclusion: ChatGPT performed well in providing accurate and relevant responses to FAQs regarding TKR, demonstrating great potential as a tool for patient education. However, it is not infallible and can occasionally provide inaccurate medical information. Patients and clinicians intending to utilize this technology should be mindful of its limitations and ensure adequate supervision and verification of information provided.
Pages: 8
Related papers
50 items in total
  • [41] Can ChatGPT reliably answer the most common patient questions regarding total shoulder arthroplasty?
    White, Christopher A.
    Masturov, Yehuda A.
    Haunschild, Eric
    Michaelson, Evan
    Shukla, Dave R.
    Cagle, Paul J.
    JOURNAL OF SHOULDER AND ELBOW SURGERY, 2025, 34 (05) : e254 - e264
  • [42] High accuracy but limited readability of large language model-generated responses to frequently asked questions about Kienböck's disease
    Asfuroglu, Zeynel Mert
    Yagar, Hilal
    Gumusoglu, Ender
    BMC MUSCULOSKELETAL DISORDERS, 2024, 25 (01)
  • [43] Presentation suitability and readability of ChatGPT's medical responses to patient questions about on knee osteoarthritis
    Yoo, Myungeun
    Jang, Chan Woong
    HEALTH INFORMATICS JOURNAL, 2025, 31 (01)
  • [44] Comment on "Evaluating ChatGPT's Accuracy in Responding to Patient Education Questions on Acute Kidney Injury and Continuous Renal Replacement Therapy"
    Daungsupawong, Hinpetch
    Wiwanitkit, Viroj
    BLOOD PURIFICATION, 2024, 53 (10) : 847 - 848
  • [45] The accuracy of sizing of the femoral component in total knee replacement
    Ng, Fu-Yuen
    Jiang, Xue-Feng
    Zhou, Wen-Zhen
    Chiu, Kwong-Yuen
    Yan, Chun-Hoi
    Fok, Margaret W. M.
    KNEE SURGERY SPORTS TRAUMATOLOGY ARTHROSCOPY, 2013, 21 (10) : 2309 - 2313
  • [46] The reliability and accuracy of digital templating in total knee replacement
    Trickett, R. W.
    Hodgson, P.
    Forster, M. C.
    Robertson, A.
    JOURNAL OF BONE AND JOINT SURGERY-BRITISH VOLUME, 2009, 91B (07) : 903 - 906
  • [47] Patient perceptions regarding minimally invasive total knee replacement
    Kundra, R. K.
    Chowdhry, M.
    Fisher, N.
    Shrestha, B.
    Mathur, K.
    EUROPEAN JOURNAL OF ORTHOPAEDIC SURGERY & TRAUMATOLOGY, 2009, 19 : 173 - 176
  • [48] Let's chat about cervical cancer: Assessing the accuracy of ChatGPT responses to cervical cancer questions
    Hermann, Catherine E.
    Patel, Jharna M.
    Boyd, Leslie
    Aviki, Emeline
    Stasenko, Marina
    GYNECOLOGIC ONCOLOGY, 2023, 179 : 164 - 168
  • [49] Appraisal of ChatGPT's responses to common patient questions regarding Tommy John surgery
    Shaari, Ariana L.
    Fano, Adam N.
    Anakwenze, Oke
    Klifto, Christopher
    SHOULDER & ELBOW, 2024, 16 (04) : 429 - 435
  • [50] Digital templating in total knee and hip replacement: an analysis of planning accuracy
    Kniesel, Bettina
    Konstantinidis, Lukas
    Hirschmueller, Anja
    Suedkamp, Norbert
    Helwig, Peter
    INTERNATIONAL ORTHOPAEDICS, 2014, 38 (04) : 733 - 739