Data-driven missing data imputation in cluster monitoring system based on deep neural network

Cited by: 0
Authors
Jie Lin
NianHua Li
Md Ashraful Alam
Yuqing Ma
Affiliations
[1] University of Electronic Science and Technology of China, School of Computer Science and Engineering
Source
Applied Intelligence | 2020 / Vol. 50
Keywords
Missing data; Cluster monitoring; Deep belief networks; Multiple imputation;
DOI
Not available
Chinese Library Classification (CLC) number
Subject classification code
Abstract
Due to cluster instability, not all data are collected in the cluster monitoring system. This paper focuses on missing data imputation for the cluster monitoring application and proposes a new hybrid multiple imputation framework. The approach differs from conventional multiple imputation techniques in that it imputes missing data for an arbitrary missing pattern using an architecture that combines model-based and data-driven components. Essentially, a deep neural network serves as the data model and extracts deep features from the data; these deep features are then processed by regression or data-driven strategies to estimate the missing data under an arbitrary missing pattern. The paper gives evidence that, if a deep neural network can be trained to construct deep features of the data, imputation based on those deep features is better than imputation performed directly on the original data. In the experiments, the proposed method is compared with conventional multiple imputation approaches across varying missing data patterns, missing ratios, and datasets, including real cluster data. The results show that when the data have a larger missing ratio and various missing patterns, the proposed algorithm achieves more accurate and more stable imputation performance.
Pages: 860-877
Number of pages: 17