"Doctor ChatGPT, Can You Help Me?"The Patient's Perspective:Cross-Sectional Study

Cited by: 9
Authors
Armbruster, Jonas [1 ]
Bussmann, Florian [1 ]
Rothhaas, Catharina [1 ]
Titze, Nadine [1 ]
Gruetzner, Paul Alfred [1 ]
Freischmidt, Holger [1 ]
Affiliations
[1] BG Klin Ludwigshafen, Dept Trauma Surg & Orthopaed, Ludwig Guttmann Str 13, D-67071 Ludwigshafen, Germany
Keywords
artificial intelligence; AI; large language models; LLM; ChatGPT; patient education; patient information; patient perceptions; chatbot; chatbots; empathy
DOI
10.2196/58831
Chinese Library Classification (CLC)
R19 [Health Care Organization and Services (Health Services Administration)];
Abstract
Background: Artificial intelligence and the large language models derived from it, such as ChatGPT, offer immense possibilities, particularly in medicine. It is already evident that ChatGPT can provide adequate and, in some cases, expert-level responses to health-related queries and advice for patients. However, it is currently unknown how patients perceive these capabilities, whether they can benefit from them, and whether they can detect potential risks such as harmful suggestions.

Objective: This study aims to clarify whether patients can obtain useful and safe health care advice from an artificial intelligence chatbot assistant.

Methods: This cross-sectional study used 100 publicly available health-related questions from 5 medical specialties (trauma, general surgery, otolaryngology, pediatrics, and internal medicine) posted on a web-based platform for patients. Responses generated by ChatGPT-4.0 and by an expert panel (EP) of experienced physicians from the same platform were compiled into 10 sets of 10 questions each. Patients evaluated the blinded responses for empathy and usefulness (assessed through the question "Would this answer have helped you?") on a scale from 1 to 5. As a control, 3 physicians in each medical specialty also rated the responses and were additionally asked about their correctness and potential for harm.

Results: In total, 200 sets of questions were submitted by 64 patients (mean age 45.7, SD 15.9 years; 29/64, 45.3% male), resulting in 2000 evaluated answers each for ChatGPT and the EP. ChatGPT scored higher on empathy (4.18 vs 2.7; P<.001) and usefulness (4.04 vs 2.98; P<.001). Subanalysis revealed a small difference in the empathy ratings given by women compared with men (4.46 vs 4.14; P=.049). ChatGPT's ratings were high regardless of participant age. The specialist physicians' evaluations showed the same highly significant pattern, and ChatGPT also scored significantly higher on correctness (4.51 vs 3.55; P<.001). Specialists rated the usefulness (3.93 vs 4.59) and correctness (4.62 vs 3.84) of potentially harmful ChatGPT responses significantly lower (P<.001); patients did not.

Conclusions: The results indicate that ChatGPT can support patients with health-related queries better than physicians, at least for written advice delivered through a web-based platform. In this study, ChatGPT's responses contained a lower percentage of potentially harmful advice than those of the web-based EP. However, this finding is based on a specific study design and may not generalize to all health care settings. Alarmingly, patients were not able to independently recognize these potential dangers.
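To make the scale of the evaluation concrete (200 submitted sets of 10 questions each yield 2000 rated answers per source), the following Python sketch shows how two such blinded 1-5 rating distributions could be compared. The abstract does not name the statistical test used, so the Mann-Whitney U test and all data below are illustrative assumptions, not the authors' actual analysis.

```python
# Minimal sketch: compare hypothetical patient ratings of ChatGPT vs. the expert
# panel (EP). The test choice and the simulated data are assumptions for
# illustration only; the study abstract does not specify the analysis method.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(0)

# Hypothetical ratings: 2000 evaluated answers per source (200 sets x 10 questions),
# each scored on a 1-5 scale for empathy or usefulness.
ratings_chatgpt = rng.integers(1, 6, size=2000)
ratings_ep = rng.integers(1, 6, size=2000)

# Two-sided comparison of the two rating distributions.
stat, p_value = mannwhitneyu(ratings_chatgpt, ratings_ep, alternative="two-sided")
print(f"mean ChatGPT {ratings_chatgpt.mean():.2f} vs EP {ratings_ep.mean():.2f}, P={p_value:.3f}")
```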
Pages: 13