Prediction of severe chest injury using natural language processing from the electronic health record

Cited by: 20
Authors
Kulshrestha, Sujay [1 ,2 ]
Dligach, Dmitriy [3 ,4 ,5 ]
Joyce, Cara [3 ,4 ]
Baker, Marshall S. [2 ,6 ]
Gonzalez, Richard [1 ,2 ]
O'Rourke, Ann P. [7 ]
Glazer, Joshua M. [8 ]
Stey, Anne [9 ]
Kruser, Jacqueline M. [10 ,11 ]
Churpek, Matthew M. [12 ]
Afshar, Majid [3 ,13 ]
Affiliations
[1] Loyola Univ Chicago, Burn & Shock Trauma Res Inst, CTRE Bldg 115,Room 315,2160 South 1st Ave, Maywood, IL USA
[2] Loyola Univ Med Ctr, Dept Surg, EMS Bldg 110,Room 3210,2160 South 1st Ave, Maywood, IL 60153 USA
[3] Loyola Univ Chicago, Ctr Hlth Outcomes & Informat Res, Hlth Sci Div, CTRE Bldg 115,Room 126,2160 South 1st Ave, Maywood, IL USA
[4] Loyola Univ Chicago, Stritch Sch Med, Dept Publ Hlth Sci, 2160 South 1st Ave, Maywood, IL USA
[5] Loyola Univ Chicago, Dept Comp Sci, 1052 West Loyola Ave, Chicago, IL USA
[6] Vet Affairs Hosp, 5000 South Fifth Ave, Hines, IL USA
[7] Univ Wisconsin, Dept Surg, 600 Highland Ave,MC 3236, Madison, WI USA
[8] Univ Wisconsin, Dept Emergency Med, 800 Univ Bay Dr,Suite 310,MC 9123, Madison, WI USA
[9] Northwestern Univ, Dept Surg, Div Trauma & Surg Crit Care, 76 North St Clair St,Suite 650, Chicago, IL USA
[10] Northwestern Univ, Dept Med, Div Pulm & Crit Care, 633 North St Clair St,20th Floor,McGaw M-335, Chicago, IL 60611 USA
[11] Northwestern Univ, Dept Med Social Sci, 633 North St Clair St,19th Floor, Chicago, IL 60611 USA
[12] Univ Wisconsin, Dept Med, 8007 Excelsior Dr, Madison, WI USA
[13] Loyola Univ Chicago, Dept Hlth Informat & Data Sci, 2160 South First Ave, Maywood, IL USA
Source
INJURY-INTERNATIONAL JOURNAL OF THE CARE OF THE INJURED | 2021 / Vol. 52 / Issue 02
Funding
National Institutes of Health (USA);
Keywords
Trauma; Machine learning; Natural language processing; Trauma registry; TRAUMA PATIENTS; SCALE; INFORMATION; MECHANISM; BENEFIT; TEXT;
DOI
10.1016/j.injury.2020.10.094
Chinese Library Classification number
R4 [Clinical Medicine];
Discipline classification code
1002 ; 100602 ;
Abstract
Introduction: Trauma injury severity scores are currently calculated retrospectively from the electronic health record (EHR) using manual annotation by certified trauma coders. Natural language processing (NLP) of clinical documents in the EHR may enable automated injury scoring. We hypothesize that NLP with machine learning can discriminate between cases of severe and non-severe injury to the thorax after trauma.

Methods: Clinical documents from a trauma center were examined between 2014 and 2018. Severe chest injury was defined as a thorax abbreviated injury score (AIS) >2 and served as the reference standard for supervised learning. Free-text unigrams and concept unique identifiers (CUIs) from the Unified Medical Language System (UMLS) were extracted from clinical documents collected at one hour, four hours, and eight hours after patient arrival to the emergency department. Logistic regression models with elastic net regularization were tuned to maximize area under the receiver operating characteristic curve (AUROC) using 10-fold cross-validation on the training dataset (80%) and tested on a hold-out 20% dataset.

Results: There were 6,891 traumas that met inclusion criteria. The complete data corpus consisted of 473,694 documents. Models trained using the first hour of data had a mean AUROC of 0.88 (95% CI [0.86, 0.89]); model discrimination and reclassification from the first hour significantly improved after eight hours, with a mean AUROC of 0.94 (95% CI [0.93, 0.95]). Performance of models using CUIs was similar to that of models using unigrams (p > 0.05). Models demonstrated excellent clinical face validity.

Conclusions: Both CUIs and unigrams demonstrated excellent discrimination in predicting severity of chest injury using the first eight hours of clinical documents. Our model demonstrates that automated anatomical injury scoring is feasible and may be used for aggregation of data for trauma research and quality programs.

© 2020 Elsevier Ltd. All rights reserved.
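The three building blocks named in the Methods (the thorax AIS >2 reference label, free-text unigram features, and AUROC as the evaluation metric) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the toy tokenizer, and the example values are all hypothetical, and the elastic net logistic regression and cross-validation steps are omitted.

```python
import re
from collections import Counter


def is_severe_chest_injury(thorax_ais: int) -> bool:
    """Reference-standard label from the abstract: thorax AIS > 2 is severe."""
    return thorax_ais > 2


def unigrams(note: str) -> Counter:
    """Count lowercased word unigrams in free text (toy tokenizer, an assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", note.lower()))


def auroc(labels, scores) -> float:
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Tied scores count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Invented toy data, for illustration only.
thorax_ais = [3, 1, 4, 2]
labels = [is_severe_chest_injury(a) for a in thorax_ais]
scores = [0.9, 0.2, 0.6, 0.7]  # hypothetical model probabilities

features = unigrams("Multiple rib fractures, rib pain")
result = auroc(labels, scores)
print(features["rib"], result)
```

In this toy example three of the four positive/negative pairs are ranked correctly, so the sketch reports an AUROC of 0.75.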
Pages: 205-212
Page count: 8