Evaluation of ChatGPT in Predicting 6-Month Outcomes After Traumatic Brain Injury

Cited by: 7
Authors
Gakuba, Clement [1, 2]
Le Barbey, Charlene [1]
Sar, Alexandre [1]
Bonnet, Gregory [3]
Cerasuolo, Damiano [4, 5]
Giabicani, Mikhael [6]
Moyer, Jean-Denis [1]
Affiliations
[1] CHU Caen Normandie, Dept Anesthesiol & Crit Care Med, Caen, France
[2] Normandie Univ, Inst Blood & Brain Caen Normandie, INSERM, U1237, UNICAEN, PhIND Physiopathol & Imaging Neurol, Caen, France
[3] Normandie Univ, UNICAEN, CNRS, ENSICAEN, Dept Grp Rech Informat Image & Instrumen, Caen, France
[4] CHU Caen Normandie, Dept Publ Hlth, Caen, France
[5] Normandie Univ, UNICAEN, INSERM U1086, ANTICIPE, Caen, France
[6] Beaujon Hosp, AP HP Nord, Dept Anaesthesiol & Crit Care, DMU Parabol, Paris, France
Keywords
artificial intelligence; ChatGPT; neurologic outcomes; prediction; traumatic brain injury; PROGNOSIS; VALIDATION;
DOI
10.1097/CCM.0000000000006236
Chinese Library Classification number
R4 [Clinical Medicine];
Discipline classification code
1002; 100602;
Abstract
OBJECTIVES: To evaluate the capacity of ChatGPT, a widely accessible and uniquely popular artificial intelligence-based chatbot, to predict the 6-month outcome following moderate-to-severe traumatic brain injury (TBI).
DESIGN: Single-center observational retrospective study.
SETTING: The neuro-ICU of a level 1 trauma center.
PATIENTS: All TBI patients admitted to the ICU between September 2021 and October 2022 were included in a prospective database.
INTERVENTIONS: None.
MEASUREMENTS AND MAIN RESULTS: Based on anonymized clinical, imaging, and biological information available at the patients' hospital admission and extracted from the database, clinical vignettes were retrospectively submitted to ChatGPT for prediction of patients' outcomes. The predictions of two intensivists (one neurointensivist and one non-neurointensivist), both from another level 1 trauma center (Beaujon Hospital), were also collected, as was the International Mission on Prognosis and Analysis of Clinical Trials in Traumatic Brain Injury (IMPACT) scoring. Each intensivist, as well as ChatGPT, made their prognostic evaluations independently, without knowledge of the others' predictions or of the patients' actual management and outcome. Both the intensivists and ChatGPT were given access to the exact same set of information. The main outcome was 6-month functional status dichotomized into favorable (Glasgow Outcome Scale Extended [GOSE] >= 5) versus poor (GOSE < 5). Prediction of intracranial hypertension management, pulmonary infectious risk, and removal of life-sustaining therapies were also investigated as secondary outcomes. Eighty consecutive moderate-to-severe TBI patients were included. For the 6-month outcome prognosis, the area under the receiver operating characteristic curve (AUC-ROC) for ChatGPT, the neurointensivist, the non-neurointensivist, and IMPACT was, respectively, 0.62 (0.50-0.74), 0.70 (0.59-0.82), 0.71 (0.59-0.82), and 0.81 (0.72-0.91). ChatGPT had the highest sensitivity (100%) but the lowest specificity (26%). For secondary outcomes, ChatGPT's prognoses were generally less accurate than clinicians' prognoses, with lower AUC values for most outcomes.
CONCLUSIONS: This study does not support the use of ChatGPT for prediction of outcomes after TBI.
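The metrics reported in the abstract (dichotomized GOSE, sensitivity, specificity, AUC-ROC) can be illustrated with a minimal Python sketch. It is not the authors' analysis code: the function names, the choice of poor outcome as the positive class, and the eight toy patients are all assumptions made for illustration, and the numbers are synthetic rather than study data.

```python
from typing import Sequence, Tuple


def dichotomize_gose(gose: int) -> int:
    """Map a GOSE score (1-8) to 1 = poor (GOSE < 5) or 0 = favorable (GOSE >= 5).

    Treating the poor outcome as the positive class is an assumption of this sketch.
    """
    if not 1 <= gose <= 8:
        raise ValueError("GOSE ranges from 1 to 8")
    return 1 if gose < 5 else 0


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels and binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc_roc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outranks a negative one.

    Ties count as one half (Mann-Whitney formulation), so the function also
    accepts binary predictions.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic toy data (NOT from the study): observed 6-month GOSE per patient
# and a hypothetical predictor's binary call (1 = predicts poor outcome).
gose_scores = [3, 7, 4, 8, 2, 6, 5, 1]
y_true = [dichotomize_gose(g) for g in gose_scores]
y_pred = [1, 1, 1, 0, 1, 1, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = auc_roc(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc:.2f}")
```

In this toy example the predictor calls every truly poor outcome correctly but also flags half of the favorable ones, the same qualitative pattern as a high-sensitivity, low-specificity predictor. For a purely binary prediction, the AUC computed this way equals the mean of sensitivity and specificity.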
Pages: 942-950
Page count: 9