Strategies for handling missing clinical data for automated surgical site infection detection from the electronic health record

Times cited: 64
Authors
Hu, Zhen [1]
Melton, Genevieve B. [1, 2]
Arsoniadis, Elliot G. [1, 2]
Wang, Yan [1]
Kwaan, Mary R. [2]
Simon, Gyorgy J. [1, 3]
Affiliations
[1] Univ Minnesota, Inst Hlth Informat, 420 Delaware St SE,MMC 912, Minneapolis, MN 55455 USA
[2] Univ Minnesota, Dept Surg, Box 242 UMHC, Minneapolis, MN 55455 USA
[3] Univ Minnesota, Dept Med, 420 Delaware St SE,MMC 912, Minneapolis, MN 55455 USA
Funding
Agency for Healthcare Research and Quality (US); National Institutes of Health (US)
Keywords
Electronic health records; Surgical site infections; Missing data; MULTIVARIATE IMPUTATION; MULTIPLE IMPUTATION; COLORECTAL SURGERY; QUALITY; CARE; IMPROVEMENT; NETWORK; IMPACT; COST
DOI
10.1016/j.jbi.2017.03.009
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Proper handling of missing data is important for many secondary uses of electronic health record (EHR) data. Data imputation methods can be used to handle missing data, but their use for analyzing EHR data is limited, and their specific efficacy for postoperative complication detection is unclear. Several data imputation methods were used to develop data models for automated detection of three types of surgical site infection (SSI) (superficial, deep, and organ space) and of overall SSI, using American College of Surgeons National Surgical Quality Improvement Program (NSQIP) Registry 30-day SSI occurrence data as the reference standard. Overall, models with missing data imputation almost always outperformed reference models without imputation, which included only cases with complete data, for detection of SSI overall, achieving very good average area under the curve values. Missing data imputation appears to be an effective means of improving postoperative SSI detection using EHR clinical data. © 2017 Elsevier Inc. All rights reserved.
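The contrast the abstract draws, between a complete-case reference model and models built on imputed data, can be illustrated with a minimal sketch. This is not the authors' pipeline: the record does not say which imputation methods or classifiers were used, so single mean imputation stands in here as the simplest possible technique, and the function names and toy values are hypothetical. The `auc` helper computes the area under the ROC curve as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one.

```python
from statistics import mean


def complete_cases(rows):
    """Keep only rows with no missing (None) values (the reference strategy)."""
    return [r for r in rows if all(v is not None for v in r)]


def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column.

    Assumption: every column has at least one observed value; otherwise
    statistics.mean raises StatisticsError.
    """
    n_cols = len(rows[0])
    col_means = [
        mean(r[j] for r in rows if r[j] is not None) for j in range(n_cols)
    ]
    return [
        [col_means[j] if v is None else v for j, v in enumerate(r)]
        for r in rows
    ]


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy records: [white blood cell count, temperature in Celsius].
rows = [[12.0, 38.5], [None, 37.0], [7.0, None], [15.0, 39.0]]

kept = complete_cases(rows)   # two of the four records are discarded
filled = mean_impute(rows)    # all four records are retained
```

On this toy input, complete-case analysis discards half of the records, while imputation keeps all of them for model training. That difference in retained cases is the trade-off the abstract evaluates, although the study itself compared several imputation methods rather than this single one.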
Pages: 112-120
Page count: 9