A Multi-directional Approach for Missing Value Estimation in Multivariate Time Series Clinical Data

Cited by: 6
Authors
Xu, Xiao [1]
Liu, Xiaoshuang [1]
Kang, Yanni [1]
Xu, Xian [1]
Wang, Junmei [1]
Sun, Yuyao [1]
Chen, Quanhe [1]
Jia, Xiaoyu [1]
Ma, Xinyue [1]
Meng, Xiaoyan [1]
Li, Xiang [1]
Xie, Guotong [1]
Affiliations
[1] Ping Hlth Technol, Beijing, Peoples R China
Keywords
Multi-directional; Missing value estimation; Multivariate time series; Feature engineering; Gradient boosting tree; Imputation
DOI
10.1007/s41666-020-00076-2
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Missing values are common in clinical datasets and pose obstacles for clinical data analysis. Correctly estimating the missing parts is critical for applying these analysis approaches. However, only a limited number of works focus on missing value estimation for multivariate time series (MTS) clinical data, which is one of the most challenging data types in this area. We attempt to develop a methodology (MD-MTS) with high accuracy for missing value estimation in MTS clinical data. In MD-MTS, temporal and cross-variable information is constructed as multi-directional features for an efficient gradient boosting decision tree (LightGBM). For each patient, temporal information represents the sequential relations among the values of one variable at different time stamps, and cross-variable information refers to the correlations among the values of different variables at a fixed time stamp. We evaluated estimation performance by the gap between the true values and the estimated values on randomly masked parts. MD-MTS outperformed three baseline methods (3D-MICE, Amelia II, and BRITS) on the ICHI challenge 2019 datasets, which contain 13 time series variables. The root-mean-square errors of MD-MTS, 3D-MICE, Amelia II, and BRITS on the offline-test dataset are 0.1717, 0.2247, 0.1900, and 0.1862, respectively. On the online-test dataset, the root-mean-square errors of the first three methods are 0.1720, 0.2235, and 0.1927, respectively. Furthermore, MD-MTS ranked first in the ICHI challenge 2019 among dozens of competing models. MD-MTS provides an accurate and robust approach for estimating missing values in MTS clinical data and can easily be used as a preprocessing step for downstream clinical data analysis.
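The two feature directions and the masked-value evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names `build_features` and `masked_rmse` are hypothetical, the feature set (nearest earlier and later observation plus same-time-stamp values of the other variables) is a simplified assumption, and a plain average of the two temporal neighbors stands in for the LightGBM model that the paper trains on such features.

```python
import numpy as np

def build_features(series, var, t):
    """Assumed, simplified multi-directional features for series[t, var].

    series: (T, V) array for one patient, np.nan marks missing entries.
    Temporal direction: nearest observed value of the same variable
    before and after time stamp t (np.nan if none exists).
    Cross-variable direction: the other variables at time stamp t.
    """
    col = series[:, var]
    before = col[:t][~np.isnan(col[:t])]
    after = col[t + 1:][~np.isnan(col[t + 1:])]
    prev_val = before[-1] if before.size else np.nan
    next_val = after[0] if after.size else np.nan
    others = np.delete(series[t], var)
    return np.concatenate(([prev_val, next_val], others))

def masked_rmse(truth, estimate, mask):
    """Root-mean-square error restricted to the masked entries."""
    diff = (truth - estimate)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))

# Toy patient: 4 time stamps, 2 variables (illustrative values only).
truth = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
mask = np.zeros(truth.shape, dtype=bool)
mask[1, 0] = True
mask[2, 1] = True
observed = truth.copy()
observed[mask] = np.nan  # hide the masked entries from the estimator

estimate = truth.copy()
for t, v in zip(*np.nonzero(mask)):
    feats = build_features(observed, v, t)
    # Stand-in for the learned model: average of the temporal neighbors.
    estimate[t, v] = np.nanmean(feats[:2])

print(masked_rmse(truth, estimate, mask))
```

In a fuller pipeline, the rows returned by `build_features` for observed entries would form the training table for a gradient boosting regressor, and `masked_rmse` would score its predictions on the held-out masked entries.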
Pages: 365-382
Page count: 18
Related papers
50 in total
  • [1] A Multi-directional Approach for Missing Value Estimation in Multivariate Time Series Clinical Data
    Xiao Xu
    Xiaoshuang Liu
    Yanni Kang
    Xian Xu
    Junmei Wang
    Yuyao Sun
    Quanhe Chen
    Xiaoyu Jia
    Xinyue Ma
    Xiaoyan Meng
    Xiang Li
    Guotong Xie
    Journal of Healthcare Informatics Research, 2020, 4 : 365 - 382
  • [2] Imputation of Missing Value Using Dynamic Bayesian Network for Multivariate Time Series Data
    Susanti, Steffi Pauli
    Azizah, Fazat Nur
    PROCEEDINGS OF 2017 INTERNATIONAL CONFERENCE ON DATA AND SOFTWARE ENGINEERING (ICODSE), 2017,
  • [3] Method of missing data imputation for multivariate time series
    Li Z.
    Zhang F.
    Wang Y.
    Tao Q.
    Li C.
    2018, Chinese Institute of Electronics (40) : 225 - 230
  • [4] Learning representations of multivariate time series with missing data
    Bianchi, Filippo Maria
    Livi, Lorenzo
    Mikalsen, Karl Oyvind
    Kampffmeyer, Michael
    Jenssen, Robert
    PATTERN RECOGNITION, 2019, 96
  • [5] Estimating Missing Data in Temporal Data Streams Using Multi-Directional Recurrent Neural Networks
    Yoon, Jinsung
    Zame, William R.
    van der Schaar, Mihaela
    IEEE TRANSACTIONS ON BIOMEDICAL ENGINEERING, 2019, 66 (05) : 1477 - 1490
  • [6] Missing Data Imputation for Multivariate Time series in Industrial IoT: A Federated Learning Approach
    Gkillas, Alexandros
    Lalos, Aris S.
    2022 IEEE 20TH INTERNATIONAL CONFERENCE ON INDUSTRIAL INFORMATICS (INDIN), 2022, : 87 - 94
  • [7] Research on Methods of Filling Missing Data for Multivariate Time Series
    Li, Zheng-Xin
    Wu, Shi-Hui
    Li, Chao
    Zhang, Yu
    2017 IEEE 2ND INTERNATIONAL CONFERENCE ON BIG DATA ANALYSIS (ICBDA), 2017, : 387 - 390
  • [8] On an Multi-directional Searching Algorithm for Two Fuzzy Clustering Methods for Categorical Multivariate Data
    Suzuki, Kazune
    Kanzawa, Yuchi
    INTEGRATED UNCERTAINTY IN KNOWLEDGE MODELLING AND DECISION MAKING (IUKM 2022), 2022, 13199 : 182 - 190
  • [9] MTSSP: Missing Value Imputation in Multivariate Time Series for Survival Prediction
    Li, Bo
    Shi, Yuliang
    Cheng, Lin
    Yan, Zhongmin
    Wang, Xinjun
    Li, Hui
    2022 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2022,
  • [10] GLIMA: Global and Local Time Series Imputation with Multi-directional Attention Learning
    Suo, Qiuling
    Zhong, Weida
    Xun, Guangxu
    Sun, Jianhui
    Chen, Changyou
    Zhang, Aidong
    2020 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2020, : 798 - 807