Utilizing Artificial Intelligence and Chat Generative Pretrained Transformer to Answer Questions About Clinical Scenarios in Neuroanesthesiology

Cited by: 4
Authors
Blacker, Samuel N. [1]
Kang, Mia [1]
Chakraborty, Indranil [2]
Chowdhury, Tumul [3]
Williams, James [1]
Lewis, Carol [1]
Zimmer, Michael [1]
Wilson, Brad [1]
Lele, Abhijit V. [4]
Affiliations
[1] Univ North Carolina Chapel Hill, Dept Anesthesiol, Chapel Hill, NC 27599 USA
[2] Univ Arkansas, Dept Anesthesiol, Little Rock, AR USA
[3] Univ Toronto, Dept Anesthesiol, Toronto, ON, Canada
[4] Univ Washington, Dept Anesthesiol, Seattle, WA USA
Keywords
artificial intelligence; ChatGPT; clinical guideline applications; neuroanesthesia; language processing; ANESTHESIOLOGY; NEUROSCIENCE; ASSOCIATION; GUIDELINES; MANAGEMENT; SOCIETY; STROKE; CARE
DOI
10.1097/ANA.0000000000000949
Chinese Library Classification number
R614 [Anesthesiology]
Discipline classification code
100217
Abstract
Objective: We tested the ability of chat generative pretrained transformer (ChatGPT), an artificial intelligence chatbot, to answer questions relevant to scenarios covered in 3 clinical management guidelines published by the Society for Neuroscience in Anesthesiology and Critical Care (SNACC): endovascular treatment of stroke, perioperative stroke (Stroke), and care of patients undergoing complex spine surgery (Spine).
Methods: Four neuroanesthesiologists independently assessed whether ChatGPT could apply 52 high-quality recommendations (HQRs) included in the 3 SNACC guidelines. HQRs were deemed present in the ChatGPT responses if noted by at least 3 of the 4 reviewers. Reviewers also identified incorrect references, potentially harmful recommendations, and whether ChatGPT cited the SNACC guidelines.
Results: The overall reviewer agreement for the presence of HQRs in the ChatGPT answers ranged from 0% to 100%. Only 4 of 52 (8%) HQRs were deemed present by at least 3 of the 4 reviewers after 5 generic questions, and 23 (44%) HQRs were deemed present after at least 1 additional targeted question. Potentially harmful recommendations were identified for each of the 3 clinical scenarios, and ChatGPT failed to cite the SNACC guidelines.
Conclusions: The ChatGPT answers were open to human interpretation regarding whether the responses included the HQRs. Though targeted questions resulted in the inclusion of more HQRs than generic questions, fewer than 50% of HQRs were noted even after targeted questions. This suggests that ChatGPT should not currently be considered a reliable source of information for clinical decision-making. Future iterations of ChatGPT may refine algorithms to improve its reliability as a source of clinical information.
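The abstract's scoring rule (an HQR counts as present when at least 3 of the 4 reviewers note it) and its reported percentages can be sketched as below. This is an illustrative sketch only, not the authors' analysis code; the function names `hqr_present` and `percent_present` and the boolean vote representation are assumptions made for the example.

```python
from typing import Sequence


def hqr_present(votes: Sequence[bool], threshold: int = 3) -> bool:
    """Return True when at least `threshold` reviewers marked the HQR as present.

    `votes` holds one boolean per reviewer (hypothetical representation).
    """
    return sum(votes) >= threshold


def percent_present(n_present: int, n_total: int) -> int:
    """Share of HQRs deemed present, rounded to the nearest whole percent."""
    return round(100 * n_present / n_total)


# Three of four reviewers agree, so the HQR counts as present.
print(hqr_present([True, True, True, False]))
# Counts reported in the abstract: 4 of 52 and 23 of 52 HQRs.
print(percent_present(4, 52), percent_present(23, 52))
```

With the counts reported in the abstract, the rounding reproduces the stated 8% and 44%.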
Pages: 346-351
Page count: 6