A Novel Deep Similarity Learning Approach to Electronic Health Records Data

Cited by: 15
Authors
Gupta, Vagisha [1]
Sachdeva, Shelly [1]
Bhalla, Subhash [2]
Affiliations
[1] Natl Inst Technol Delhi NITD, Dept Comp Sci & Engn, Delhi 110040, India
[2] Univ Aizu, Dept Comp Sci & Engn, Aizu Wakamatsu, Fukushima 9658580, Japan
Keywords
Deep learning; Diseases; Medical services; Predictive models; Data models; Medical diagnostic imaging; Feature extraction; Convolutional neural networks; deep learning; electronic health records; nephrology; similarity learning; softmax-based technique; MISSING DATA; MULTIPLE IMPUTATION; RISK PREDICTION; MACHINE; ALGORITHM; CHALLENGES; FRAMEWORK; DISEASE; SYSTEM;
DOI
10.1109/ACCESS.2020.3037710
CLC number
TP [Automation technology, computer technology];
Subject classification code
0812;
Abstract
The past decade has seen a tremendous advancement in using Electronic Health Records (EHRs) to offer clinical decision support and provide personalized healthcare to patients. Despite the potential benefits offered by EHR data, it is challenging to represent and analyze large EHRs for predictive modeling due to heterogeneity, high dimensionality, and sparsity. This article proposes a novel supervised Deep Similarity Learning approach that learns the patient representations and also finds the relationship between the patients using pairwise similarity learning to facilitate predictive analysis for personalized healthcare. We develop CNN_Softmax which is a Siamese-based neural network for multi-class classification methods corresponding to the prediction of disease. It uses Convolutional Neural Network (CNN) to study the vector representation of raw EHRs and capture essential information of patient features, and a Softmax-based supervised classification method that learns the similarity between pairs of patients and performs disease prediction using this similarity information. Our approach uses data type mapping to handle heterogeneity and the polynomial interpolation method to handle sparsity existing in EHR data. ORBDA, which is an openEHR (standard) benchmark dataset, is used for evaluating this study. Experimental results show that CNN_Softmax achieves an accuracy of 97.8%, a recall of 98.1%, a precision of 96.02%, and an F1 score of 97.82%. The comparative results show that our proposed novel methodology performs disease prediction with highly promising results and outperforms state-of-the-art similarity learning methods. The current study is the first attempt to perform disease prediction on standardized EHRs, to the best of the authors' knowledge. The deep similarity learning approach provides support for clinical decision making that is more reliable and generalizable than previous approaches and focuses on dealing with heterogeneous and sparse data. The concept also serves as a new implementation of artificial intelligence technologies for the application of clinical big data.
Pages: 209278-209295
Number of pages: 18