A multistage deep imputation framework for missing values large segment imputation with statistical metrics

Cited by: 5
Authors
Yang, JinSheng [1]
Shao, YuanHai [1]
Li, ChunNa [1]
Wang, WenSi [2]
Affiliations
[1] Hainan Univ, Management Sch, Haikou 570228, Peoples R China
[2] Beijing Univ Technol, Fac Informat Technol, Beijing 100021, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Missing values; Imputation; Deep learning; Multistage framework; Performance metrics; Chained equations
DOI
10.1016/j.asoc.2023.110654
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The presence of missing values is a pervasive and unavoidable phenomenon in sensor data. Despite numerous efforts from researchers to address this issue through imputation techniques, particularly in deep learning models, the unique data distributions and periods inherent in real-world sensor data are often neglected. This paper presents a novel, multistage deep learning-based imputation framework with adaptability to missing value imputation. The framework incorporates a mixture measurement index that accounts for both low- and higher-order statistical aspects of data distribution and a more adaptive evaluation metric, which improves upon traditional mean squared error. Additionally, a multistage imputation strategy and dynamic data length adjustment are integrated into the imputation process to account for variations in data periods. Empirical results on diverse sensor data demonstrate the superiority of the proposed framework, particularly in addressing large segment imputation issues, as evidenced by improved imputation performance. The implementation and experimental results have been made publicly available on GitHub. © 2023 Published by Elsevier B.V.
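The abstract does not give the formula for the mixture measurement index, so the following is only an illustrative sketch of the general idea: a score that adds gaps in low-order statistics (mean, standard deviation) and higher-order statistics (skewness, kurtosis) to the mean squared error. The function names `moments` and `mixture_score`, the choice of moments, and the equal default weights are all assumptions made for illustration, not the authors' definition.

```python
import numpy as np


def moments(x):
    """Return mean, standard deviation, skewness, and excess kurtosis of x."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    sigma = x.std()
    if sigma == 0.0:
        # A constant series has no spread; treat its shape statistics as zero.
        return mu, 0.0, 0.0, 0.0
    z = (x - mu) / sigma
    return mu, sigma, (z ** 3).mean(), (z ** 4).mean() - 3.0


def mixture_score(truth, imputed, weights=(1.0, 1.0, 1.0, 1.0, 1.0)):
    """Hypothetical mixture index: weighted MSE plus absolute moment gaps.

    weights order: (mse, mean, std, skewness, kurtosis). Lower is better.
    This is an assumed form, not the index defined in the paper.
    """
    truth = np.asarray(truth, dtype=float)
    imputed = np.asarray(imputed, dtype=float)
    mse = np.mean((truth - imputed) ** 2)
    gaps = [abs(a - b) for a, b in zip(moments(truth), moments(imputed))]
    terms = [mse] + gaps
    return float(sum(w * t for w, t in zip(weights, terms)))


# Example: a flat fill over one sine period has a moderate MSE but
# reproduces none of the spread or shape of the true segment.
t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
truth = np.sin(t)
flat_fill = np.zeros_like(truth)
plain_mse = float(np.mean((truth - flat_fill) ** 2))
score = mixture_score(truth, flat_fill)
```

In this sketch the flat fill is penalized beyond its MSE because its standard deviation and kurtosis differ from those of the true segment, which is the kind of distribution mismatch a pointwise error alone does not capture when a large segment is imputed.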
Pages: 17
Related papers
51 in total
  • [1] Acar E., 2010, P 2010 SIAM INT C DA, P701, DOI DOI 10.1137/1.9781611972801.61
  • [2] A Comprehensive Survey on Imputation of Missing Data in Internet of Things
    Adhikari, Deepak
    Jiang, Wei
    Zhan, Jinyu
    He, Zhiyuan
    Rawat, Danda B.
    Aickelin, Uwe
    Khorshidi, Hadi A.
    [J]. ACM COMPUTING SURVEYS, 2023, 55 (07)
  • [3] Multiple imputation by chained equations: what is it and how does it work?
    Azur, Melissa J.
    Stuart, Elizabeth A.
    Frangakis, Constantine
    Leaf, Philip J.
    [J]. INTERNATIONAL JOURNAL OF METHODS IN PSYCHIATRIC RESEARCH, 2011, 20 (01) : 40 - 49
  • [4] Univariate and multivariate skewness and kurtosis for measuring nonnormality: Prevalence, influence and estimation
    Cain, Meghan K.
    Zhang, Zhiyong
    Yuan, Ke-Hai
    [J]. BEHAVIOR RESEARCH METHODS, 2017, 49 (05) : 1716 - 1735
  • [5] Cao W, 2018, Arxiv, DOI arXiv:1805.10572
  • [6] A Vision of IoT: Applications, Challenges, and Opportunities With China Perspective
    Chen, Shanzhi
    Xu, Hui
    Liu, Dake
    Hu, Bo
    Wang, Hucheng
    [J]. IEEE INTERNET OF THINGS JOURNAL, 2014, 1 (04) : 349 - 359
  • [7] Li SCX, 2019, Arxiv, DOI [arXiv:1902.09599, 10.48550/arXiv.1902.09599, DOI 10.48550/ARXIV.1902.09599]
  • [8] Choi TM, 2020, Arxiv, DOI arXiv:2010.10075
  • [9] Sequence-to-Sequence Imputation of Missing Sensor Data
    Dabrowski, Joel Janek
    Rahman, Ashfaqur
    [J]. AI 2019: ADVANCES IN ARTIFICIAL INTELLIGENCE, 2019, 11919 : 265 - 276
  • [10] On field calibration of an electronic nose for benzene estimation in an urban pollution monitoring scenario
    De Vito, S.
    Massera, E.
    Piga, A.
    Martinotto, L.
    Di Francia, G.
    [J]. SENSORS AND ACTUATORS B-CHEMICAL, 2008, 129 (02) : 750 - 757