Is ChatGPT More Humane than Humans? Accuracy, Relevance and Humanness of the Answers Provided by ChatGPT for Patient Education for Total Knee Replacement

Cited by: 0

Authors:

Bafna, Khushi [1]

Yadav, Amit Kumar [1]

Bagaria, Shaurya [1]

Gianchandani, Ayushi [1]

Poduval, Murali [2]

Bagaria, Vaibhav [1]

Affiliations:

[1] Sir HN Reliance Fdn Hosp, Dept Orthoped, Mumbai, India

[2] Lifesci Engn, Mumbai, India

Source:

INDIAN JOURNAL OF ORTHOPAEDICS | 2025

Keywords:

ChatGPT; Knee; Knee arthroplasty; Turing test; Large Language Model (LLM)

DOI:

10.1007/s43465-025-01474-7

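Chinese Library Classification (CLC) number:

R826.8 [Plastic surgery]; R782.2 [Oral and maxillofacial plastic surgery]; R726.2 [Pediatric plastic surgery]; R62 [Plastic surgery (reconstructive surgery)]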

Discipline classification code:

Abstract:

Aim: This study aimed to assess the reliability of orthopaedic information provided by ChatGPT in response to common patient inquiries and concerns related to Total Joint Replacement surgery, focusing on preventing the dissemination of potentially harmful medical advice.

Method: This qualitative exploratory case study was conducted at a tertiary care hospital. Ten questions that patients commonly pose to orthopaedic surgeons when considering knee arthroplasty were formulated and presented independently to an Orthopaedic Consultant, an Associate Consultant, a Fellow in Joint Replacement surgery, and ChatGPT. The answers were submitted for review to a panel of three orthopaedic surgeons specializing in arthroplasty. The answers were scored on accuracy, relevance, and usefulness, with a maximum possible score of 100.0 points, allowing for 0.5-point increments. The panel members were also asked to identify which answers came from ChatGPT rather than from humans.

Results: ChatGPT achieved the highest total aggregate score, 232.0 points out of a maximum of 300.0 points, surpassing the scores of the human participants (Participant 1: 197.0 points; Participant 2: 227.5 points; Participant 3: 220.5 points). Furthermore, two out of three panel specialists rated ChatGPT the highest. When the average scores for ChatGPT and the human participants were compared for each question, ChatGPT outperformed the human participants in 8 out of 10 questions. Out of the 120 encounter instances, the evaluators correctly identified a response as coming from ChatGPT only 14 times (11.66%).

Conclusion: This study highlights the utility and limitations of ChatGPT in the medical field. ChatGPT exhibits great potential in assisting doctors and surgeons in patient care by providing accurate and relevant information. The study also demonstrated that the answers seemed indistinguishable from those of humans in most cases. In the current landscape of ChatGPT and other AI technologies, their integration in the medical field should be viewed as complementary to human expertise, which must be leveraged for the greater good.
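The arithmetic behind the reported results can be checked with a short sketch. The figures are the ones quoted in the abstract; the function and variable names are hypothetical, and reading the 300.0-point maximum as three evaluators times 100.0 points each is an interpretation of the abstract, not something it states outright.

```python
# Figures quoted in the abstract; all names here are illustrative only.
# Assumption: the 300.0-point maximum is 3 evaluators x 100.0 points each.
MAX_TOTAL = 300.0

totals = {
    "ChatGPT": 232.0,
    "Participant 1": 197.0,
    "Participant 2": 227.5,
    "Participant 3": 220.5,
}


def percent_of_max(score, max_total=MAX_TOTAL):
    """Express an aggregate score as a percentage of the maximum, rounded to 2 places."""
    return round(100.0 * score / max_total, 2)


def detection_rate(correct, encounters):
    """Percentage of encounters in which ChatGPT was correctly identified."""
    return round(100.0 * correct / encounters, 2)


# Rank respondents from highest to lowest aggregate score.
ranking = sorted(totals, key=totals.get, reverse=True)

for name in ranking:
    print(f"{name}: {totals[name]} / {MAX_TOTAL} -> {percent_of_max(totals[name])}%")

print(f"Correct ChatGPT identifications: {detection_rate(14, 120)}%")
```

With conventional rounding, 14 out of 120 comes to 11.67%; the 11.66% given in the abstract corresponds to truncating the same ratio.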

Pages: 7
