Medical Data Wrangling With Sequential Variational Autoencoders

Cited by: 2
Authors
Barrejon, Daniel [1, 2]
Olmos, Pablo M. [1, 2]
Artes-Rodriguez, Antonio [1, 2]
Affiliations
[1] UC3M, Dept Signal Theory & Commun, Madrid 28911, Spain
[2] Gregorio Maranon Hlth Res Inst Spain, Madrid 28911, Spain
Funding
European Research Council;
Keywords
Data models; Correlation; Monitoring; Databases; Measurement; Hospitals; Time series analysis; Deep learning; VAE; missing data; heterogeneous; sequential data;
DOI
10.1109/JBHI.2021.3123839
Chinese Library Classification number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Medical data sets are usually corrupted by noise and missing data. The missing patterns are commonly assumed to be completely random, but in medical scenarios they tend to occur in bursts, for example because sensors are off for some time or because data are collected in a misaligned, uneven fashion. This paper proposes to model medical data records with heterogeneous data types and bursty missing data using sequential variational autoencoders (VAEs). In particular, we propose a new methodology, the Shi-VAE, which extends the capabilities of VAEs to sequential streams of data with missing observations. We compare our model against state-of-the-art solutions on an intensive care unit (ICU) database and a dataset of passive human monitoring. Furthermore, we find that standard error metrics such as RMSE are not conclusive enough to assess temporal models, so we include in our analysis the cross-correlation between the ground truth and the imputed signal. We show that Shi-VAE achieves the best performance on both metrics, with lower computational complexity than the GP-VAE model, which is the state-of-the-art method for medical records.
Pages: 2737-2745
Number of pages: 9