An explainable knowledge distillation method with XGBoost for ICU mortality prediction

Citations: 24
Authors
Liu, Mucan [1]
Guo, Chonghui [1]
Guo, Sijia [1]
Affiliations
[1] Dalian Univ Technol, Inst Syst Engn, Dalian 116024, Peoples R China
Keywords
Intensive care units; Mortality prediction; Knowledge distillation; Explainable machine learning
DOI
10.1016/j.compbiomed.2022.106466
Chinese Library Classification
Q [Biological Sciences]
Discipline Codes
07; 0710; 09
Abstract
Background and Objective: Mortality prediction is an important task in the intensive care unit (ICU) for quantifying the severity of patients' physiological condition. Scoring systems are currently the most widely applied tools for mortality prediction, but their performance is unsatisfactory in many clinical conditions because the underlying models are non-specific and linear. With the availability of large volumes of data recorded in electronic health records (EHRs), deep learning models have achieved state-of-the-art predictive performance. However, deep learning models struggle to meet the explainability requirements of clinical settings. Hence, an explainable Knowledge Distillation method with XGBoost (XGB-KD) is proposed to improve the predictive performance of XGBoost while supporting better explainability.
Methods: In this method, we first use high-performing deep learning teacher models to learn the complex patterns hidden in high-dimensional multivariate time series data. Then, we distill knowledge from the soft labels generated by the ensemble of teacher models to guide the training of an XGBoost student model, whose inputs are meaningful features obtained from feature engineering. Finally, we calibrate the model so that its predicted probabilities reflect the true posterior probabilities, and we use SHapley Additive exPlanations (SHAP) to obtain insights about the trained model.
Results: We conduct comprehensive experiments on the MIMIC-III dataset to evaluate our method. The results demonstrate that our method achieves better predictive performance than vanilla XGBoost, deep learning models, and several state-of-the-art baselines from related work. Our method can also provide intuitive explanations.
Conclusions: Our method improves the predictive performance of XGBoost by distilling knowledge from deep learning models and can provide meaningful explanations for predictions.
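The distill-then-calibrate pipeline the abstract describes can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: a logistic-regression "teacher" stands in for the deep learning ensemble, scikit-learn's gradient-boosted regressor stands in for XGBoost (xgboost's `XGBRegressor` would be the natural choice in practice), and the hard/soft mixing weight `alpha` is a hypothetical hyperparameter not specified in the abstract.

```python
# Sketch of XGB-KD's three steps under the assumptions above:
# (1) teacher produces soft labels, (2) student trains on blended targets,
# (3) isotonic regression calibrates the student's scores.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.isotonic import IsotonicRegression
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary-outcome data standing in for engineered EHR features.
X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# 1) Teacher: any strong probabilistic model (the paper uses deep ensembles
#    on multivariate time series).
teacher = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
soft = teacher.predict_proba(X_tr)[:, 1]           # soft labels in [0, 1]

# 2) Student: boosted trees trained on a blend of hard labels and the
#    teacher's soft labels; `alpha` is a hypothetical mixing weight.
alpha = 0.5
student = GradientBoostingRegressor(n_estimators=200, max_depth=3,
                                    random_state=0)
student.fit(X_tr, alpha * y_tr + (1 - alpha) * soft)
raw = np.clip(student.predict(X_te), 0.0, 1.0)     # clip to probability range

# 3) Calibration: isotonic regression maps raw scores toward posteriors
#    (a held-out calibration split would be used in practice; the training
#    split is reused here only for brevity).
iso = IsotonicRegression(out_of_bounds="clip").fit(
    np.clip(student.predict(X_tr), 0.0, 1.0), y_tr)
calibrated = iso.predict(raw)

accuracy = float(((calibrated > 0.5) == y_te).mean())
```

For the explanation step, the calibrated student is a tree ensemble, so SHAP's tree explainer (`shap.TreeExplainer(student)`) can attribute each prediction to the engineered input features, as the abstract describes.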
Pages: 13