How accurately can ChatGPT 3.5 answer frequently asked questions by patients on glenohumeral osteoarthritis?

Cited by: 0

Authors:

Youssef, Yasmin [1]

Youssef, Salim [1]

Melcher, Peter [2]

Henkelmann, Ralf [1]

Osterhoff, Georg [1]

Theopold, Jan [1]

Affiliations:

[1] Univ Hosp Leipzig, Dept Orthoped Trauma & Plast Surg, Liebigstr 20, D-04103 Leipzig, Germany

[2] Helios Klin Leisnig, Unfallchirurgie & Orthopadie, Leisnig, Germany

Source:

OBERE EXTREMITAET-SCHULTER-ELLENBOGEN-HAND-UPPER EXTREMITY-SHOULDER ELBOW HAND | 2024

Keywords:

Artificial intelligence; Generative AI; Shoulder; Patient safety; Education of patients; Künstliche Intelligenz; Generative KI; Schulter; Patientensicherheit; Patientenaufklärung

DOI:

10.1007/s11678-024-00836-1

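Chinese Library Classification (CLC) number:

R826.8 [Plastic surgery]; R782.2 [Oral and maxillofacial plastic surgery]; R726.2 [Pediatric plastic surgery]; R62 [Plastic surgery (reconstructive surgery)]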

Discipline classification code:

Abstract:

Background: Conversational artificial intelligence (AI) systems like ChatGPT have emerged as valuable assets in providing accessible information across various domains, including the healthcare system. The use of ChatGPT may contribute to better patient education and better general healthcare knowledge. However, there is a paucity of data on the reliability of responses generated by ChatGPT in the context of specific medical diagnoses.

Methods: We identified 12 frequently asked questions by patients about glenohumeral osteoarthritis. These questions were formulated in both English and German, using common and medical terms for the condition, thus creating four groups for evaluation. The questions were then presented to ChatGPT 3.5. The generated responses were evaluated for accuracy by four independent orthopedic and trauma surgery consultants using a Likert scale (0 = fully inaccurate to 4 = fully accurate).

Results: Although there were two questions in two groups, all questions across all versions were answered with good accuracy by ChatGPT 3.5. The highest score on the Likert scale was 3.9 for the group where questions were posed in English using the medical term "glenohumeral osteoarthritis." The lowest score of 3.2 was for the group where questions were posed in English using the common term "shoulder arthrosis." On average, questions in English received a score of 3.5 on the Likert scale, slightly higher than those in German, which received a score of 3.4.

Conclusion: ChatGPT 3.5 can already provide accurate responses to patients' frequently asked questions on glenohumeral osteoarthritis. ChatGPT can therefore be a valuable tool for patient communication and education in the field of orthopedics. Further studies, however, have to be performed in order to fully understand the mechanisms and impact of ChatGPT in the field.
Hintergrund: Generative Systeme der künstlichen Intelligenz (KI) wie ChatGPT haben sich als wertvolle Hilfsmittel für die Bereitstellung von Informationen in verschiedenen Bereichen, einschließlich des Gesundheitswesens, erwiesen. Der Einsatz von ChatGPT kann zu einer besseren Patientenaufklärung und einem besseren Allgemeinwissen über das Gesundheitswesen beitragen. Jedoch existieren nur wenige Daten über die Zuverlässigkeit der von ChatGPT generierten Antworten im Zusammenhang mit spezifischen medizinischen Diagnosen.

Methoden: Zunächst wurden 12 von Patienten häufig gestellte Fragen zur glenohumeralen Osteoarthritis ermittelt. Diese Fragen wurden sowohl auf Englisch als auch auf Deutsch formuliert, wobei allgemeine und medizinische Begriffe für die Erkrankung verwendet wurden, sodass 4 Gruppen gebildet wurden. Die Fragen wurden dann in ChatGPT 3.5 eingegeben. Die generierten Antworten wurden von 4 unabhängigen Fachärzten für Orthopädie und Unfallchirurgie anhand einer Likert-Skala (0 = völlig ungenau bis 4 = völlig zutreffend) auf ihre Richtigkeit hin bewertet.

Ergebnisse: Obwohl 2 Fragen in 2 Gruppen vorhanden waren, wurden alle Fragen über alle Versionen hinweg mit guter Genauigkeit von ChatGPT 3.5 beantwortet. Der höchste Wert auf der Likert-Skala wurde mit 3,9 in der Gruppe erreicht, in der die Fragen auf Englisch gestellt wurden, wobei der medizinische Begriff „glenohumeral osteoarthritis“ verwendet wurde. Den niedrigsten Wert erreichte mit 3,2 die Gruppe, in der die Fragen in englischer Sprache gestellt wurden und der gewöhnliche Begriff „shoulder arthrosis“ verwendet wurde. Im Durchschnitt erhielten die Fragen in englischer Sprache einen Wert von 3,5 auf der Likert-Skala, etwas mehr als die Fragen in deutscher Sprache, die einen Wert von 3,4 erhielten.

Schlussfolgerung: ChatGPT 3.5 kann bereits genaue Antworten auf häufig gestellte Patientenfragen zur glenohumeralen Arthrose geben. Daher kann ChatGPT ein wertvolles Instrument für die Patientenkommunikation und -aufklärung im Bereich der Orthopädie sein. Weitere Studien müssen jedoch durchgeführt werden, um die Mechanismen und Auswirkungen von ChatGPT in diesem Bereich vollständig zu verstehen.

References

Pages: 6

27 entries in total

[1] Investigating the Use of ChatGpt as a Novel Method for Seeking Health Information: A Qualitative Approach [J].

Al Shboul M.K.I. ;

Alwreikat A. ;

Alotaibi F.A. .

Science and Technology Libraries, 2024, 43 (03) :225-234

[2] The promise of artificial intelligence: a review of the opportunities and challenges of artificial intelligence in healthcare [J].

Aung, Yuri Y. M. ;

Wong, David C. S. ;

Ting, Daniel S. W. .

BRITISH MEDICAL BULLETIN, 2021, 139 (01) :4-15

[3] Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum [J].

Ayers, John W. ;

Poliak, Adam ;

Dredze, Mark ;

Leas, Eric C. ;

Zhu, Zechariah ;

Kelley, Jessica B. ;

Faix, Dennis J. ;

Goodman, Aaron M. ;

Longhurst, Christopher A. ;

Hogarth, Michael ;

Smith, Davey M. .

JAMA INTERNAL MEDICINE, 2023, 183 (06) :589-596

[4] The Ottawa panel clinical practice guidelines for the management of knee osteoarthritis. Part two: strengthening exercise programs [J].

Brosseau, Lucie ;

Taki, Jade ;

Desjardins, Brigit ;

Thevenot, Odette ;

Fransen, Marlene ;

Wells, George A. ;

Imoto, Aline Mizusaki ;

Toupin-April, Karine ;

Westby, Marie ;

Alvarez Gallardo, Inmaculada C. ;

Gifford, Wendy ;

Laferriere, Lucie ;

Rahman, Prinon ;

Loew, Laurianne ;

De Angelis, Gino ;

Cavallo, Sabrina ;

Shallwani, Shirin Mehdi ;

Aburub, Ala' ;

Bennell, Kim L. ;

Van der Esch, Martin ;

Simic, Milena ;

McConnell, Sara ;

Harmer, Alison ;

Kenny, Glen P. ;

Paterson, Gail ;

Regnaux, Jean-Philippe ;

Lefevre-Colau, Marie-Martine ;

McLean, Linda .

CLINICAL REHABILITATION, 2017, 31 (05) :596-611

[5] Garg RK, 2023, HEALTH PROMOT PERSPE, V13, P183, DOI 10.34172/hpp.2023.22

[6] How Does ChatGPT Perform on the United States Medical Licensing Examination (USMLE)? The Implications of Large Language Models for Medical Education and Knowledge Assessment [J].

Gilson, Aidan ;

Safranek, Conrad W. ;

Huang, Thomas ;

Socrates, Vimig ;

Chi, Ling ;

Taylor, Richard Andrew ;

Chartash, David .

JMIR MEDICAL EDUCATION, 2023, 9

[7] Enhancing Patient Communication With Chat-GPT in Radiology: Evaluating the Efficacy and Readability of Answers to Common Imaging-Related Questions [J].

Gordon, Emile B. ;

Towbin, Alexander J. ;

Wingrove, Peter ;

Sha, Umber ;

Haas, Brian ;

Kitts, Andrea B. ;

Feldman, Jill ;

Furlan, Alessandro .

JOURNAL OF THE AMERICAN COLLEGE OF RADIOLOGY, 2024, 21 (02) :353-359

[8] Evaluating ChatGPT responses to frequently asked patient questions regarding periprosthetic joint infection after total hip and knee arthroplasty [J].

Hu, Xiaojun ;

Niemann, Marcel ;

Kienzle, Arne ;

Braun, Karl ;

Back, David Alexander ;

Gwinner, Clemens ;

Renz, Nora ;

Stoeckle, Ulrich ;

Trampuz, Andrej ;

Meller, Sebastian .

DIGITAL HEALTH, 2024, 10

[9] Glenohumeral Osteoarthritis: An Overview of Etiology and Diagnostics [J].

Ibounig, T. ;

Simons, T. ;

Launonen, A. ;

Paavola, M. .

SCANDINAVIAN JOURNAL OF SURGERY, 2021, 110 (03) :441-451

[10] ChatGPT Passes German State Examination in Medicine With Picture Questions Omitted [J].

Jung, Leonard B. ;

Gudera, Jonas A. ;

Wiegand, Tim L. T. ;

Allmendinger, Simeon ;

Dimitriadis, Konstantinos ;

Koerte, Inga K. .

DEUTSCHES ARZTEBLATT INTERNATIONAL, 2023, 120 (21-22) :373-374
