A multistage deep imputation framework for missing values large segment imputation with statistical metrics

Cited by: 5
Authors
Yang, JinSheng [1]
Shao, YuanHai [1]
Li, ChunNa [1]
Wang, WenSi [2]
Affiliations
[1] Hainan Univ, Management Sch, Haikou 570228, Peoples R China
[2] Beijing Univ Technol, Fac Informat Technol, Beijing 100021, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Missing values; Imputation; Deep learning; Multistage framework; Performance metrics; Chained equations
DOI
10.1016/j.asoc.2023.110654
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The presence of missing values is a pervasive and unavoidable phenomenon in sensor data. Despite numerous efforts from researchers to address this issue through imputation techniques, particularly in deep learning models, the unique data distributions and periods inherent in real-world sensor data are often neglected. This paper presents a novel, multistage deep learning-based imputation framework with adaptability to missing value imputation. The framework incorporates a mixture measurement index that accounts for both low- and higher-order statistical aspects of data distribution and a more adaptive evaluation metric, which improves upon traditional mean squared error. Additionally, a multistage imputation strategy and dynamic data length adjustment are integrated into the imputation process to account for variations in data periods. Empirical results on diverse sensor data demonstrate the superiority of the proposed framework, particularly in addressing large segment imputation issues, as evidenced by improved imputation performance. The implementation and experimental results have been made publicly available on GitHub. © 2023 Published by Elsevier B.V.
Pages: 17
Related papers
51 in total
  • [11] Partial imputation of unseen records to improve classification using a hybrid multi-layered artificial immune system and genetic algorithm
    Duma, Mlungisi
    Marwala, Tshilidzi
    Twala, Bhekisipho
    Nelwamondo, Fulufhelo
    [J]. APPLIED SOFT COMPUTING, 2013, 13 (12) : 4461 - 4480
  • [12] Fang CG, 2020, Arxiv, DOI [arXiv:2011.11347, 10.48550/arXiv.2011.11347]
  • [13] A novel framework for imputation of missing values in databases
    Farhangfar, Alireza
    Kurgan, Lukasz A.
    Pedrycz, Witold
    [J]. IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART A-SYSTEMS AND HUMANS, 2007, 37 (05) : 692 - 709
  • [14] Ghahramani Z., 1994, Advances in Neural Information Processing Systems, P120
  • [15] A data imputation method for multivariate time series based on generative adversarial network
    Guo, Zijian
    Wan, Yiming
    Ye, Hao
    [J]. NEUROCOMPUTING, 2019, 360 : 185 - 197
  • [16] Gupta M., 2020, arXiv
  • [17] DLIN: Deep Ladder Imputation Network
    Hallaji, Ehsan
    Razavi-Far, Roozbeh
    Saif, Mehrdad
    [J]. IEEE TRANSACTIONS ON CYBERNETICS, 2022, 52 (09) : 8629 - 8641
  • [18] Nearest neighbor imputation of species-level, plot-scale forest structure attributes from LiDAR data
    Hudak, Andrew T.
    Crookston, Nicholas L.
    Evans, Jeffrey S.
    Hall, David E.
    Falkowski, Michael J.
    [J]. REMOTE SENSING OF ENVIRONMENT, 2008, 112 (05) : 2232 - 2245
  • [19] Methods for imputation of missing values in air quality data sets
    Junninen, H
    Niska, H
    Tuppurainen, K
    Ruuskanen, J
    Kolehmainen, M
    [J]. ATMOSPHERIC ENVIRONMENT, 2004, 38 (18) : 2895 - 2907
  • [20] A large-scale sensor missing data imputation framework for dams using deep learning and transfer learning strategy
    Li, Yangtao
    Bao, Tengfei
    Chen, Hao
    Zhang, Kang
    Shu, Xiaosong
    Chen, Zexun
    Hu, Yuhan
    [J]. MEASUREMENT, 2021, 178