HOLMES: Health OnLine Model Ensemble Serving for Deep Learning Models in Intensive Care Units

Cited by: 55
Authors
Hong, Shenda [1]
Xu, Yanbo [1]
Khare, Alind [1]
Priambada, Satria [1]
Maher, Kevin [2]
Aljiffry, Alaa [2]
Sun, Jimeng [3]
Tumanov, Alexey [1]
Affiliations
[1] Georgia Institute of Technology, Atlanta, GA 30332 USA
[2] Children's Healthcare of Atlanta, Atlanta, GA USA
[3] University of Illinois, Urbana, IL USA
Source
KDD '20: PROCEEDINGS OF THE 26TH ACM SIGKDD INTERNATIONAL CONFERENCE ON KNOWLEDGE DISCOVERY & DATA MINING | 2020
Funding
National Science Foundation (USA);
Keywords
Healthcare; Health Informatics; Data Mining System; Software; ALGORITHM; COSTS;
DOI
10.1145/3394486.3403212
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Deep learning models have achieved expert-level performance in healthcare with an exclusive focus on training accurate models. However, in many clinical environments such as the intensive care unit (ICU), real-time model serving is equally if not more important than accuracy, because in the ICU patient care is simultaneously more urgent and more expensive. Clinical decisions and their timeliness, therefore, directly affect both the patient outcome and the cost of care. To make timely decisions, we argue the underlying serving system must be latency-aware. To compound the challenge, health analytic applications often require a combination of models instead of a single model, to better specialize individual models for different targets, multi-modal data, different prediction windows, and potentially personalized predictions. To address these challenges, we propose HOLMES, an online model ensemble serving framework for healthcare applications. HOLMES dynamically identifies the best-performing set of models to ensemble for the highest accuracy, while also satisfying sub-second latency constraints on end-to-end prediction. We demonstrate that HOLMES is able to navigate the accuracy/latency tradeoff efficiently, compose the ensemble, and serve the model ensemble pipeline, scaling to simultaneously streaming data from 100 patients, each producing waveform data at 250 Hz. HOLMES outperforms conventional offline batch-processed inference for the same clinical task in terms of accuracy and latency (by an order of magnitude). HOLMES is tested on a risk prediction task on pediatric cardiac ICU data, with above 95% prediction accuracy and sub-second latency in a 64-bed simulation.
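The accuracy/latency tradeoff described in the abstract can be illustrated with a minimal sketch. This is not the HOLMES algorithm: it swaps in a brute-force search over a tiny, hypothetical model zoo and assumes that models run serially (so their latencies add) and that the ensemble averages predicted probabilities. All model names, latencies, and validation scores below are invented for illustration.

```python
from itertools import combinations


def ensemble_accuracy(probs, labels):
    """Accuracy of the averaged probabilities, thresholded at 0.5."""
    n_models = len(probs)
    correct = 0
    for i, y in enumerate(labels):
        avg = sum(p[i] for p in probs) / n_models
        correct += int((avg >= 0.5) == bool(y))
    return correct / len(labels)


def select_ensemble(models, labels, latency_budget_ms):
    """Return (subset, accuracy, latency_ms) of the best feasible ensemble.

    models maps a name to (latency_ms, validation_probabilities).
    Assumption: latencies add up, i.e. models are executed one after another.
    """
    best = (None, -1.0, 0)
    names = sorted(models)
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            latency = sum(models[m][0] for m in subset)
            if latency > latency_budget_ms:
                continue  # violates the latency constraint
            acc = ensemble_accuracy([models[m][1] for m in subset], labels)
            # Prefer higher accuracy; break ties with lower latency.
            if acc > best[1] or (acc == best[1] and latency < best[2]):
                best = (subset, acc, latency)
    return best


# Hypothetical validation labels and per-model outputs (not from the paper).
labels = [1, 0, 1, 0, 1, 0]
models = {
    "ecg_cnn": (300, [0.9, 0.2, 0.4, 0.1, 0.8, 0.6]),
    "vitals_rnn": (400, [0.6, 0.4, 0.7, 0.6, 0.3, 0.3]),
    "waveform_transformer": (600, [0.8, 0.1, 0.9, 0.7, 0.9, 0.1]),
}

subset, accuracy, latency_ms = select_ensemble(models, labels, latency_budget_ms=800)
print(subset, accuracy, latency_ms)
```

In this toy setting the two cheaper models together beat the single most accurate model while staying under the 800 ms budget. Exhaustive search is exponential in the number of models, so it only serves to make the constraint concrete for a handful of candidates.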
Pages: 1614-1624
Number of pages: 11