Large Language Model (LLM)-Powered Chatbots Fail to Generate Guideline-Consistent Content on Resuscitation and May Provide Potentially Harmful Advice

Cited: 14
Authors
Birkun, Alexei A. [1 ,3 ]
Gautam, Adhish [2 ]
Affiliations
[1] SI Georgievsky VI Vernadsky Crimean Fed Univ, Dept Gen Surg Anesthesiol Resuscitat & Emergency Med, Simferopol 295051, Russia
[2] Reg Govt Hosp, Una 174303, HP, India
[3] SI Georgievsky VI Vernadsky Crimean Fed Univ, Lenin Blvd 5-7, Simferopol 295051, Russia
Keywords
artificial hallucination; artificial intelligence; cardiac arrest; cardiopulmonary resuscitation; chatbot; large language model; CARDIOPULMONARY-RESUSCITATION; AVAILABILITY; SYSTEMS; SUPPORT;
DOI
10.1017/S1049023X23006568
Chinese Library Classification
R4 [Clinical Medicine]
Subject Classification Codes
1002; 100602
Abstract
Introduction: Innovative large language model (LLM)-powered chatbots, which are extremely popular nowadays, represent potential sources of information on resuscitation for the general public. For instance, chatbot-generated advice could be used for community resuscitation education or for just-in-time informational support of untrained lay rescuers in a real-life emergency.

Study Objective: This study assessed the performance of two prominent LLM-based chatbots, particularly the quality of the chatbot-generated advice on how to help a non-breathing victim.

Methods: In May 2023, the new Bing (Microsoft Corporation, USA) and Bard (Google LLC, USA) chatbots were queried (n = 20 each): "What to do if someone is not breathing?" The content of the chatbots' responses was evaluated for compliance with the 2021 Resuscitation Council United Kingdom guidelines using a pre-developed checklist.

Results: Both chatbots provided context-dependent textual responses to the query. However, coverage of guideline-consistent instructions on helping a non-breathing victim within the responses was poor: the mean percentage of the responses completely satisfying the checklist criteria was 9.5% for Bing and 11.4% for Bard (P > .05). Essential elements of bystander action, including early initiation and uninterrupted performance of chest compressions with adequate depth, rate, and chest recoil, as well as requesting and using an automated external defibrillator (AED), were missing as a rule. Moreover, 55.0% of Bard's responses contained plausible-sounding but nonsensical guidance, termed artificial hallucinations, which creates a risk of inadequate care and harm to the victim.

Conclusion: The LLM-powered chatbots' advice on helping a non-breathing victim omits essential details of resuscitation technique and occasionally contains deceptive, potentially harmful directives. Further research and regulatory measures are required to mitigate the risks posed by chatbot-generated misinformation on resuscitation to the public.
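As a rough illustration of the Methods, the sketch below shows how checklist-based compliance scoring and the between-chatbot comparison might be implemented. The checklist wording, the simulated per-response scores, and the use of a Mann-Whitney U test are assumptions for illustration only; the paper reports only that the difference was non-significant (P > .05) and does not specify the test.

    # Hypothetical sketch of checklist-based compliance scoring; checklist
    # items and the statistical test are illustrative assumptions, not the
    # study's actual instrument.
    import random

    from scipy.stats import mannwhitneyu

    # Invented stand-ins for the study's pre-developed checklist (the real
    # one was derived from the 2021 Resuscitation Council UK guidelines).
    CHECKLIST = [
        "check responsiveness",
        "call emergency services",
        "start chest compressions without delay",
        "compress 5-6 cm deep at 100-120/min",
        "allow full chest recoil between compressions",
        "send someone to fetch an AED",
    ]

    def score_response(satisfied: set[str]) -> float:
        """Percentage of checklist criteria completely satisfied by one response."""
        return 100.0 * len(satisfied & set(CHECKLIST)) / len(CHECKLIST)

    # In the study, each of the 20 responses per chatbot was reviewed
    # manually; here we simulate which criteria each response satisfied.
    random.seed(1)
    bing_scores = [
        score_response(set(random.sample(CHECKLIST, random.randint(0, 2))))
        for _ in range(20)
    ]
    bard_scores = [
        score_response(set(random.sample(CHECKLIST, random.randint(0, 2))))
        for _ in range(20)
    ]

    stat, p = mannwhitneyu(bing_scores, bard_scores)  # non-parametric comparison
    print(f"Bing mean: {sum(bing_scores) / 20:.1f}%  "
          f"Bard mean: {sum(bard_scores) / 20:.1f}%  P = {p:.3f}")

In this setup, each response contributes one compliance percentage, and the two distributions of 20 scores are compared directly, mirroring the per-chatbot means reported in the Results.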
Pages: 757-763
Number of pages: 7