Learning hidden patterns from patient multivariate time series data using convolutional neural networks: A case study of healthcare cost prediction

Cited by: 29
Authors
Morid, Mohammad Amin [1]
Sheng, Olivia R. Liu [2]
Kawamoto, Kensaku [3]
Abdelrahman, Samir [3, 4]
Affiliations
[1] Santa Clara Univ, Leavey Sch Business, Dept Informat Syst & Analyt, Santa Clara, CA USA
[2] Univ Utah, David Eccles Sch Business, Dept Operat & Informat Syst, Salt Lake City, UT 84108 USA
[3] Univ Utah, Dept Biomed Informat, 421 Wakara Way, Salt Lake City, UT 84108 USA
[4] Cairo Univ, Comp Sci Dept, Giza, Egypt
Keywords
Healthcare cost prediction; Representation learning; Temporal pattern detection; Deep learning; Convolutional neural networks; Healthcare claims data
DOI
10.1016/j.jbi.2020.103565
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Objective: To develop an effective and scalable individual-level patient cost prediction method by automatically learning hidden temporal patterns from multivariate time series data in patient insurance claims using a convolutional neural network (CNN) architecture.
Methods: We used three years of medical and pharmacy claims data from 2013 to 2016 from a healthcare insurer; data from the first two years were used to build the model to predict costs in the third year. The data consisted of multivariate time series of cost, visit and medical features, shaped as images of patients' health status (i.e., matrices with time windows on one dimension and the medical, visit and cost features on the other). These images were given to a CNN with a proposed architecture. After hyper-parameter tuning, the proposed architecture consisted of three building blocks of convolution and pooling layers with an LReLU activation function and a kernel size customized at each layer for healthcare data. The temporal patterns learned by the proposed CNN became inputs to a fully connected layer. We benchmarked the proposed method against three other methods: (1) a spike temporal pattern detection method, as the most accurate method for healthcare cost prediction described to date in the literature; (2) a symbolic temporal pattern detection method, as the most common approach for leveraging healthcare temporal data; and (3) the most commonly used CNN architectures for image pattern detection (i.e., AlexNet, VGGNet and ResNet), applied via transfer learning. Moreover, we assessed the contribution of each type of data (i.e., cost, visit and medical). Finally, we externally validated the proposed method against a separate cohort of patients. All prediction performances were measured in terms of mean absolute percentage error (MAPE).
Results: The proposed CNN configuration outperformed the spike temporal pattern detection and symbolic temporal pattern detection methods, with a MAPE of 1.67 versus 2.02 and 3.66, respectively (p < 0.01). The proposed CNN also outperformed ResNet, AlexNet and VGGNet, which had MAPEs of 4.59, 4.85 and 5.06, respectively (p < 0.01). Removing medical, visit and cost features resulted in MAPEs of 1.98, 1.91 and 2.04, respectively (p < 0.01).
Conclusions: Feature learning through the proposed CNN configuration significantly improved individual-level healthcare cost prediction. The proposed CNN outperformed temporal pattern detection methods that look for a pre-defined set of pattern shapes, since it can extract a variable number of patterns with various shapes. Temporal patterns learned from medical, visit and cost data each made significant contributions to the prediction performance. Hyper-parameter tuning showed that considering three-month data patterns yielded the highest prediction accuracy. Our results showed that patients' images extracted from multivariate time series data differ from regular images and hence require unique designs of CNN architectures. The proposed method for converting multivariate time series data of patients into images and tuning them for convolutional learning could be applied in many other healthcare applications with multivariate time series data.
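As an illustration of two ideas in the abstract, the sketch below shapes a patient's claims into a feature-by-time-window matrix (the "image") and computes MAPE. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the (window, feature, value) record layout, the summing of values within a window, and the feature names are all hypothetical. MAPE is returned here as a fraction; whether the reported values use this scaling or a percentage is not stated in the abstract.

```python
import numpy as np


def patient_image(records, n_windows, features):
    """Build a (features x time windows) matrix for one patient.

    `records` is an iterable of (window_index, feature_name, value) tuples.
    This layout is an assumption made for illustration only.
    """
    row_of = {name: row for row, name in enumerate(features)}
    image = np.zeros((len(features), n_windows), dtype=float)
    for window, feature, value in records:
        # Values that fall into the same window are summed (an assumption).
        image[row_of[feature], window] += value
    return image


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction; assumes no actual value is zero."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)))


# Hypothetical usage: two features observed over three time windows.
records = [(0, "cost", 100.0), (0, "cost", 50.0), (2, "visits", 1.0)]
image = patient_image(records, n_windows=3, features=["cost", "visits"])
error = mape([100.0, 200.0], [110.0, 180.0])
```

A matrix built this way could then be fed to a convolutional model in the same manner as a single-channel image; the kernel sizes and layer settings tuned in the paper are not given in the abstract and are not reproduced here.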
Pages: 11