An industrial missing values processing method based on generating model

Cited by: 12
Authors
Wang, Huan [1, 2]
Yuan, Zhaolin [1 ]
Chen, Yibin [1 ]
Shen, Bingyang [1 ]
Wu, Aixiang [1 ]
Affiliations
[1] Univ Sci & Technol Beijing, Sch Civil & Environm Engn, Beijing 100083, Peoples R China
[2] Anshan Iron & Steel Grp Corp, Anshan, Peoples R China
Keywords
DAE; GAN; Generating model; IIOT; MCAR; Missing values; IMPUTATION
DOI
10.1016/j.comnet.2019.02.007
Chinese Library Classification
TP3 [Computing technology, computer technology]
Subject classification code
0812
Abstract
Missing values (MVs) are widespread in real-world datasets and obstruct the use of many statistical and machine learning algorithms for data analytics, because these algorithms cannot process incomplete datasets. Most current MV imputation methods apply only to datasets of specific types or with a low missing rate. This paper proposes a missing-value processing method that combines a denoising autoencoder (DAE) with generative adversarial networks (GAN), targeting missing completely at random (MCAR) datasets with a high missing rate and noise interference in industrial settings. The training process is executed on a discrete dataset with missing values, so that the generated dataset closely follows the feature distribution of the original dataset. To demonstrate the feasibility and efficiency of the method, we conduct experiments on datasets of different dimensionality, including three authoritative public datasets and an industrial production monitoring dataset. Compared with traditional missing-value imputation methods, our method shows better robustness and accuracy when the missing rate is higher than 30%. (C) 2019 Published by Elsevier B.V.
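The MCAR setting and the comparison against a traditional imputation baseline mentioned in the abstract can be illustrated with a minimal sketch. It does not implement the paper's DAE and GAN model; it only shows, under assumptions, how entries can be masked completely at random at a chosen missing rate and how a simple column-mean baseline might be scored on the masked entries. All function names, the toy Gaussian data, and the choice of RMSE as the error measure are hypothetical and are not taken from the paper.

```python
import math
import random


def apply_mcar_mask(rows, missing_rate, seed=0):
    """Return a copy of rows where each entry is independently replaced
    by NaN with probability missing_rate (the MCAR assumption)."""
    rng = random.Random(seed)
    return [
        [math.nan if rng.random() < missing_rate else value for value in row]
        for row in rows
    ]


def mean_impute(rows):
    """Fill each NaN with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    col_means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if not math.isnan(row[j])]
        # Fall back to 0.0 if a column has no observed values at all.
        col_means.append(sum(observed) / len(observed) if observed else 0.0)
    return [
        [col_means[j] if math.isnan(value) else value
         for j, value in enumerate(row)]
        for row in rows
    ]


def rmse_on_missing(original, masked, imputed):
    """Root mean squared error computed only over the masked entries."""
    errors = [
        (o - i) ** 2
        for o_row, m_row, i_row in zip(original, masked, imputed)
        for o, m, i in zip(o_row, m_row, i_row)
        if math.isnan(m)
    ]
    return math.sqrt(sum(errors) / len(errors)) if errors else 0.0


# Toy data: 200 rows, 3 features (hypothetical, not from the paper).
data_rng = random.Random(42)
data = [[data_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(200)]

masked = apply_mcar_mask(data, missing_rate=0.4, seed=1)
imputed = mean_impute(masked)
print(rmse_on_missing(data, masked, imputed))
```

A learned imputer could be dropped in place of `mean_impute` and scored with the same `rmse_on_missing` call, which keeps the comparison across missing rates consistent.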
Pages: 61-68
Number of pages: 8