Assessment of Artificial Intelligence Chatbot Responses to Common Patient Questions on Bone Sarcoma

Cited: 0
Authors
Khabaz, Kameel [1 ]
Newman-Hung, Nicole J. [2 ]
Kallini, Jennifer R. [2 ]
Kendal, Joseph [3 ]
Christ, Alexander B. [2 ]
Bernthal, Nicholas M. [2 ]
Wessel, Lauren E. [2 ]
Affiliations
[1] David Geffen Sch Med UCLA, Los Angeles, CA 90095 USA
[2] Univ Calif Los Angeles, Dept Orthopaed Surg, Los Angeles, CA USA
[3] Univ Calgary, Dept Surg, Calgary, AB, Canada
Keywords
artificial intelligence; bone sarcomas; chatbots; chondrosarcoma; Ewing sarcoma; osteosarcoma;
DOI
10.1002/jso.27966
Chinese Library Classification: R73 [Oncology]
Subject Classification Code: 100214
Abstract
Background and Objectives: The potential impacts of artificial intelligence (AI) chatbots on care for patients with bone sarcoma are poorly understood. Elucidating potential risks and benefits would allow surgeons to define appropriate roles for these tools in clinical care.
Methods: Eleven questions on bone sarcoma diagnosis, treatment, and recovery were input into three AI chatbots. Answers were assessed on a 5-point Likert scale for five clinical accuracy metrics: relevance to the question, balance and lack of bias, basis in established data, factual accuracy, and completeness in scope. Responses were quantitatively assessed for empathy and readability. The Patient Education Materials Assessment Tool (PEMAT) was used to assess understandability and actionability.
Results: Chatbots scored highly on relevance (4.24) and balance/lack of bias (4.09) but lower on basing responses on established data (3.77), completeness (3.68), and factual accuracy (3.66). Responses generally scored well on understandability (84.30%), while actionability scores were low for questions on treatment (64.58%) and recovery (60.64%). GPT-4 exhibited the highest empathy (4.12). Mean readability scores ranged from 10.28 for diagnosis questions to 11.65 for recovery questions.
Conclusions: While AI chatbots are promising tools, current limitations in factual accuracy and completeness, as well as concerns about inaccessibility for populations with lower health literacy, may significantly limit their clinical utility.
Pages: 6
Related Papers (50 records)
  • [1] Evaluating the Quality of Artificial Intelligence Chatbot Responses to Patient Questions on Bladder Cancer
    Collin, Harry
    Roberts, Matthew
    ASIA-PACIFIC JOURNAL OF CLINICAL ONCOLOGY, 2023, 19 : 67 - 67
  • [2] Usefulness and Accuracy of Artificial Intelligence Chatbot Responses to Patient Questions for Neurosurgical Procedures
    Gajjar, Avi A.
    Kumar, Rohit Prem
    Paliwoda, Ethan D.
    Kuo, Cathleen C.
    Adida, Samuel
    Legarreta, Andrew D.
    Deng, Hansen
    Anand, Sharath Kumar
    Hamilton, D. Kojo
    Agarwal, Nitin
    Buell, Thomas J.
    Gerszten, Peter C.
    Hudson, Joseph S.
    NEUROSURGERY, 2024, 95 (01) : 171 - 178
  • [3] In Reply: Usefulness and Accuracy of Artificial Intelligence Chatbot Responses to Patient Questions for Neurosurgical Procedures
    Gajjar, Avi A.
    Prem Kumar, Rohit
    Hamilton, David Kojo
    Buell, Thomas J.
    Agarwal, Nitin
    Gerszten, Peter C.
    Hudson, Joseph S.
    NEUROSURGERY, 2024, 95 (03) : e98 - e98
  • [4] Letter: Usefulness and Accuracy of Artificial Intelligence Chatbot Responses to Patient Questions for Neurosurgical Procedures
    Daungsupawong, Hinpetch
    Wiwanitkit, Viroj
    NEUROSURGERY, 2024, 95 (03) : e97 - e97
  • [5] Commentary: Usefulness and Accuracy of Artificial Intelligence Chatbot Responses to Patient Questions for Neurosurgical Procedures
    MacNeil, Andrew J.
    Dagra, Abeer
    Lucke-Wold, Brandon
    NEUROSURGERY, 2024, 95 (01) : e10 - e11
  • [6] Artificial intelligence-based ChatGPT chatbot responses for patient and parent questions on vernal keratoconjunctivitis
    Rasmussen, Marie Louise Roed
    Larsen, Ann-Cathrine
    Subhi, Yousif
    Potapenko, Ivan
    GRAEFES ARCHIVE FOR CLINICAL AND EXPERIMENTAL OPHTHALMOLOGY, 2023, 261 (10) : 3041 - 3043
  • [8] Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum
    Ayers, John W.
    Poliak, Adam
    Dredze, Mark
    Leas, Eric C.
    Zhu, Zechariah
    Kelley, Jessica B.
    Faix, Dennis J.
    Goodman, Aaron M.
    Longhurst, Christopher A.
    Hogarth, Michael
    Smith, Davey M.
    JAMA INTERNAL MEDICINE, 2023, 183 (06) : 589 - 596
  • [9] Physician and Artificial Intelligence Chatbot Responses to Cancer Questions From Social Media
    Chen, David
    Parsa, Rod
    Hope, Andrew
    Hannon, Breffni
    Mak, Ernie
    Eng, Lawson
    Liu, Fei-Fei
    Fallah-Rad, Nazanin
    Heesters, Ann M.
    Raman, Srinivas
    JAMA ONCOLOGY, 2024, 10 (07) : 956 - 960
  • [10] Reliability of artificial intelligence chatbot responses to frequently asked questions in breast surgical oncology
    Roldan-Vasquez, Estefania
    Mitri, Samir
    Bhasin, Shreya
    Bharani, Tina
    Capasso, Kathryn
    Haslinger, Michelle
    Sharma, Ranjna
    James, Ted A.
    JOURNAL OF SURGICAL ONCOLOGY, 2024, 130 (02) : 188 - 203