Patient2Vec: A Personalized Interpretable Deep Representation of the Longitudinal Electronic Health Record

Cited by: 98
Authors
Zhang, Jinghe [1]
Kowsari, Kamran [1, 2]
Harrison, James H. [3, 4, 5]
Lobo, Jennifer M. [3, 5]
Barnes, Laura E. [1, 2, 5]
Affiliations
[1] Univ Virginia, Dept Syst & Informat Engn, Charlottesville, VA 22904 USA
[2] Univ Virginia, Sensing Syst Hlth Lab, Charlottesville, VA 22904 USA
[3] Univ Virginia, Dept Publ Hlth Sci, Charlottesville, VA 22904 USA
[4] Univ Virginia, Div Lab Med, Dept Pathol, Charlottesville, VA 22904 USA
[5] Univ Virginia, Data Sci Inst, Charlottesville, VA 22904 USA
Keywords
Attention mechanism; gated recurrent unit; hospitalization; longitudinal electronic health record; personalization; representation learning; risk prediction models
DOI
10.1109/ACCESS.2018.2875677
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
The wide implementation of electronic health record (EHR) systems facilitates the collection of large-scale health data from real clinical settings. Despite the significant increase in the adoption of EHR systems, these data remain largely unexplored, yet they are a rich source for knowledge discovery from patient health histories in tasks such as understanding disease correlations and predicting health outcomes. However, the heterogeneity, sparsity, noise, and bias in these data present many complex challenges, which make it difficult to translate potentially relevant information into machine learning algorithms. In this paper, we propose a computational framework, Patient2Vec, to learn an interpretable deep representation of longitudinal EHR data that is personalized for each patient. To evaluate this approach, we apply it to the prediction of future hospitalizations using real EHR data and compare its predictive performance with baseline methods. Patient2Vec produces a vector space with meaningful structure and achieves an area under the curve (AUC) of around 0.799, outperforming baseline methods. Finally, the learned feature importance can be visualized and interpreted at both the individual and population levels to provide clinical insights.
Pages: 65333-65346
Number of pages: 14