An interpretable risk prediction model for healthcare with pattern attention

Cited by: 17
Authors
Kamal, Sundreen Asad [1]
Yin, Changchang [2]
Qian, Buyue [1]
Zhang, Ping [2, 3]
Affiliations
[1] Xi An Jiao Tong Univ, Dept Comp Sci & Technol, 28 Xianning West Rd, Xian 710049, Shaanxi, Peoples R China
[2] Ohio State Univ, Dept Comp Sci & Engn, 2015 Neil Ave, Columbus, OH 43210 USA
[3] Ohio State Univ, Dept Biomed Informat, 1800 Cannon Dr, Columbus, OH 43210 USA
Funding
National Natural Science Foundation of China
Keywords
EHR; Risk prediction; Self-attention; Interpretability;
DOI
10.1186/s12911-020-01331-7
Chinese Library Classification number
R-058
Subject classification code
Abstract
Background: The availability of massive amounts of data makes clinical predictive tasks possible, and deep learning methods have achieved promising performance on them. However, most existing methods suffer from three limitations. (1) Real-valued events have many missing values. Many methods impute them and then train on the imputed values, which may introduce imputation bias and makes model performance highly dependent on imputation accuracy. (2) Many existing studies take only Boolean-valued medical events (e.g., diagnosis codes) as inputs and ignore real-valued medical events (e.g., lab tests and vital signs), which are more important for acute disease (e.g., sepsis) and mortality prediction. (3) Existing interpretable models can show which medical events contribute to the output, but cannot give the contributions of patterns among medical events.
Methods: In this study, we propose a novel interpretable Pattern Attention model with Value Embedding (PAVE) to predict the risks of certain diseases. PAVE takes the embeddings of various medical events, their values, and their occurrence times as inputs, and leverages a self-attention mechanism to attend to meaningful patterns among medical events for risk prediction tasks. Because only the observed values are embedded into vectors, missing values need not be imputed, which avoids imputation bias. Moreover, the self-attention mechanism aids interpretability: the proposed model can output which patterns cause high risks.
Results: We conduct sepsis onset prediction and mortality prediction experiments on the publicly available MIMIC-III dataset and on our proprietary EHR dataset. The experimental results show that PAVE outperforms existing models. Moreover, by analyzing the self-attention weights, our model outputs meaningful medical event patterns related to mortality.
Conclusions: PAVE learns effective medical event representations by incorporating values and occurrence times, which can improve risk prediction performance. Moreover, the presented self-attention mechanism can not only capture patients' health state information but also output the contributions of various medical event patterns, which paves the way for interpretable clinical risk prediction.
Availability: The code for this paper is available at: https://github.com/yinchangchang/PAVE.
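The idea of embedding only observed events and attending over them can be illustrated with a minimal Python sketch. This is not the authors' implementation (see the repository above for that): the event names, the embedding size, the additive way values and times are mixed into the event vector, the mean pooling, and the random untrained weights are all assumptions made for illustration.

```python
import math
import random

DIM = 8  # embedding size (illustrative assumption)
rng = random.Random(0)


def rand_vec():
    """Random vector standing in for a learned parameter (untrained)."""
    return [rng.uniform(-0.5, 0.5) for _ in range(DIM)]


# Hypothetical parameters; in a real model these would be learned.
EVENT_EMB = {name: rand_vec() for name in ("heart_rate", "lactate", "wbc")}
W_VALUE, W_TIME, W_OUT = rand_vec(), rand_vec(), rand_vec()


def embed(event, value, hours):
    """Embed one observed (event, normalized value, time in hours) triple."""
    base = EVENT_EMB[event]
    return [b + value * wv + hours * wt
            for b, wv, wt in zip(base, W_VALUE, W_TIME)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def predict_risk(observed):
    """Return (risk, attention matrix) for a non-empty list of observations."""
    vecs = [embed(*obs) for obs in observed]
    scale = math.sqrt(DIM)
    # Scaled dot-product self-attention: every event attends to every event.
    attn = [softmax([dot(q, k) / scale for k in vecs]) for q in vecs]
    context = [[sum(w * v[d] for w, v in zip(row, vecs)) for d in range(DIM)]
               for row in attn]
    # Mean-pool the attended vectors, then apply a logistic output layer.
    pooled = [sum(c[d] for c in context) / len(context) for d in range(DIM)]
    risk = 1.0 / (1.0 + math.exp(-dot(W_OUT, pooled)))
    return risk, attn


# "wbc" was never measured, so it is simply absent: nothing is imputed.
observed = [("heart_rate", 1.2, 0.0), ("lactate", 0.8, 0.5),
            ("heart_rate", 1.5, 1.0)]
risk, attn = predict_risk(observed)
```

Each row of `attn` is a distribution over the observed events, so a large entry `attn[i][j]` marks a pair of events that the sketch weights heavily together. That is the kind of signal the abstract describes inspecting for interpretability, although here the weights are random rather than trained.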
Pages: 10