Deep imputation of missing values in time series health data: A review with benchmarking

Cited by: 13
Authors
Kazijevs, Maksims [1]
Samad, Manar D. [1]
Affiliations
[1] Tennessee State Univ, Dept Comp Sci, Nashville, TN 37209 USA
Funding
National Institutes of Health (USA)
Keywords
Time series; Multivariate data; Longitudinal imputation; Cross-sectional imputation; Missing value imputation; Deep neural network; Electronic health records; Sensor data
DOI
10.1016/j.jbi.2023.104440
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
The imputation of missing values in multivariate time series (MTS) data is critical in ensuring data quality and producing reliable data-driven predictive models. Apart from many statistical approaches, a few recent studies have proposed state-of-the-art deep learning methods to impute missing values in MTS data. However, the evaluation of these deep methods is limited to one or two data sets, low missing rates, and completely random missing value types. This survey performs six data-centric experiments to benchmark state-of-the-art deep imputation methods on five time series health data sets. Our extensive analysis reveals that no single imputation method outperforms the others on all five data sets. The imputation performance depends on data types, individual variable statistics, missing value rates, and types. Deep learning methods that jointly perform cross-sectional (across variables) and longitudinal (across time) imputations of missing values in time series data yield statistically better data quality than traditional imputation methods. Although computationally expensive, deep learning methods are practical given the current availability of high-performance computing resources, especially when data quality and sample size are of paramount importance in healthcare informatics. Our findings highlight the importance of data-centric selection of imputation methods to optimize data-driven predictive models.
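The distinction the abstract draws between longitudinal (across time) and cross-sectional (across variables) imputation can be illustrated with a minimal sketch. This is not the paper's benchmark or any of its deep methods: the function names, the linear-interpolation and least-squares baselines, and the completely-at-random masking scheme are all assumptions chosen for illustration, and every variable is assumed to have at least one observed value.

```python
import numpy as np

def mask_mcar(x, rate, rng):
    """Hide a fraction `rate` of entries completely at random (illustrative)."""
    mask = rng.random(x.shape) < rate          # True where a value is hidden
    x_missing = x.copy()
    x_missing[mask] = np.nan
    return x_missing, mask

def impute_longitudinal(x_missing):
    """Across time: linear interpolation within each variable (column)."""
    out = x_missing.copy()
    t = np.arange(out.shape[0])
    for j in range(out.shape[1]):
        obs = ~np.isnan(out[:, j])
        if obs.any():
            # np.interp holds the nearest observed value beyond the ends
            out[:, j] = np.interp(t, t[obs], out[obs, j])
    return out

def impute_cross_sectional(x_missing):
    """Across variables: least-squares regression on the other variables."""
    col_means = np.nanmean(x_missing, axis=0)
    # Predictors are mean-filled so every row can be used for prediction
    filled = np.where(np.isnan(x_missing), col_means, x_missing)
    out = x_missing.copy()
    for j in range(out.shape[1]):
        miss = np.isnan(x_missing[:, j])
        if not miss.any() or miss.all():
            continue
        others = np.delete(filled, j, axis=1)
        design = np.column_stack([np.ones(len(others)), others])
        coef, *_ = np.linalg.lstsq(design[~miss], x_missing[~miss, j], rcond=None)
        out[miss, j] = design[miss] @ coef
    return out

def rmse(x_true, x_imputed, mask):
    """Root-mean-square error on the hidden entries only."""
    return float(np.sqrt(np.mean((x_true[mask] - x_imputed[mask]) ** 2)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.linspace(0, 4 * np.pi, 200)
    # Synthetic three-variable series; the third variable depends on the first two
    x = np.column_stack([np.sin(t), np.cos(t), np.sin(t) + 0.5 * np.cos(t)])
    x += 0.05 * rng.standard_normal(x.shape)
    for rate in (0.1, 0.5):
        x_missing, mask = mask_mcar(x, rate, rng)
        print(rate,
              rmse(x, impute_longitudinal(x_missing), mask),
              rmse(x, impute_cross_sectional(x_missing), mask))
```

Running the script prints the masked-entry error of both baselines at two missing rates on synthetic data. A joint method of the kind the abstract describes would draw on both the time axis and the other variables at once.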
Pages: 18
Related papers
50 in total
  • [31] Long-term missing value imputation for time series data using deep neural networks
    Park, Jangho
    Muller, Juliane
    Arora, Bhavna
    Faybishenko, Boris
    Pastorello, Gilberto
    Varadharajan, Charuleka
    Sahu, Reetik
    Agarwal, Deborah
    NEURAL COMPUTING & APPLICATIONS, 2023, 35 (12) : 9071 - 9091
  • [33] Data Imputation for Multivariate Time Series Sensor Data With Large Gaps of Missing Data
    Wu, Rui
    Hamshaw, Scott D.
    Yang, Lei
    Kincaid, Dustin W.
    Etheridge, Randall
    Ghasemkhani, Amir
    IEEE SENSORS JOURNAL, 2022, 22 (11) : 10671 - 10683
  • [34] Imputation of continuous missing values in profile data
    Yang, Luo
    Wang, Kaibo
    QUALITY AND RELIABILITY ENGINEERING INTERNATIONAL, 2022, 38 (07) : 3644 - 3662
  • [35] Imputation strategies for missing data in environmental time series for an unlucky situation
    Mendola, D
    INNOVATIONS IN CLASSIFICATION, DATA SCIENCE, AND INFORMATION SYSTEMS, 2005 : 275 - 282
  • [36] Review: A gentle introduction to imputation of missing values
    Donders, A. Rogier T.
    van der Heijden, Geert J. M. G.
    Stijnen, Theo
    Moons, Karel G. M.
    JOURNAL OF CLINICAL EPIDEMIOLOGY, 2006, 59 (10) : 1087 - 1091
  • [37] Monitoring Time Series with Missing Values: A Deep Probabilistic Approach
    Barazani, Oshri
    Tolpin, David
    CYBER SECURITY, CRYPTOLOGY, AND MACHINE LEARNING, 2022, 13301 : 19 - 28
  • [38] A time series continuous missing values imputation method based on generative adversarial networks
    Wang, Yunsheng
    Xu, Xinghan
    Hu, Lei
    Fan, Jianchao
    Han, Min
    KNOWLEDGE-BASED SYSTEMS, 2024, 283
  • [39] Missing Value Imputation on Multidimensional Time Series
    Bansal, Parikshit
    Deshpande, Prathamesh
    Sarawagi, Sunita
    PROCEEDINGS OF THE VLDB ENDOWMENT, 2021, 14 (11) : 2533 - 2545
  • [40] Missing data imputation of high-resolution temporal climate time series data
    Afrifa-Yamoah, E.
    Mueller, U. A.
    Taylor, S. M.
    Fisher, A. J.
    METEOROLOGICAL APPLICATIONS, 2020, 27 (01)