Generative Adversarial Networks Imputation for High Rate Missing Values

Cited by: 4
Authors
Wang, Huan [1]
Chen, Yibin [1]
Shen, Bingyang [1]
Wu, Di [2]
Ban, Xiaojuan [1]
Affiliations
[1] Univ Sci & Technol Beijing, Sch Comp & Commun Engn, Beijing, Peoples R China
[2] Norwegian Univ Sci & Technol, Dept ICT & Nat Sci, Alesund, Norway
Source
IEEE 2018 INTERNATIONAL CONGRESS ON CYBERMATICS / 2018 IEEE CONFERENCES ON INTERNET OF THINGS, GREEN COMPUTING AND COMMUNICATIONS, CYBER, PHYSICAL AND SOCIAL COMPUTING, SMART DATA, BLOCKCHAIN, COMPUTER AND INFORMATION TECHNOLOGY | 2018
Keywords
missing values; MCAR; generating model; GAN;
DOI
10.1109/Cybermatics_2018.2018.00121
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Missing values (MVs) are widespread in real-world datasets and obstruct the use of many statistical and machine learning algorithms for data analytics, because these algorithms cannot process incomplete datasets. Most current MV imputation methods apply only to datasets of certain specific types or with a low missing rate. To address this problem, we propose a new imputation method for missing completely at random (MCAR) data with a high missing rate. The method is based on the generative adversarial network (GAN) architecture. We run the training process on a discrete dataset with missing values, in order to ensure that the generated dataset closely follows the feature distribution of the original dataset. We conduct experiments on two different data types to demonstrate the feasibility and efficiency of the method. The first is a public, authoritative dataset of wireless sensor records. The second is a large collection of data from an industrial production monitoring process. Compared with traditional missing value imputation methods, the results show that our method is more robust and accurate when the missing rate is higher than 30%.
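As an illustration of the evaluation setting the abstract describes, the following minimal sketch removes entries completely at random at a 30% rate and scores a traditional baseline (column-mean imputation) on the removed entries. It is not the paper's GAN method. The synthetic Gaussian data, the helper names `mcar_mask`, `mean_impute`, and `rmse_on_missing`, and the choice of RMSE as the metric are all assumptions made for illustration.

```python
import numpy as np

def mcar_mask(shape, missing_rate, rng):
    """Boolean mask; True marks an entry removed completely at random."""
    # Each entry is dropped independently of its value and of all other
    # entries, which is what MCAR means.
    return rng.random(shape) < missing_rate

def mean_impute(x_missing):
    """Fill NaNs in each column with that column's observed mean."""
    col_means = np.nanmean(x_missing, axis=0)
    return np.where(np.isnan(x_missing), col_means, x_missing)

def rmse_on_missing(x_true, x_imputed, mask):
    """RMSE computed only over the entries that were masked out."""
    diff = (x_true - x_imputed)[mask]
    return float(np.sqrt(np.mean(diff ** 2)))

rng = np.random.default_rng(0)
x_true = rng.normal(size=(1000, 5))          # hypothetical stand-in data
mask = mcar_mask(x_true.shape, 0.3, rng)     # 30% MCAR missingness
x_missing = np.where(mask, np.nan, x_true)
x_imputed = mean_impute(x_missing)
score = rmse_on_missing(x_true, x_imputed, mask)
print(f"missing rate: {mask.mean():.3f}, baseline RMSE: {score:.3f}")
```

Any imputation method, including a GAN-based one, could be dropped in place of `mean_impute` and compared on the same mask, which keeps the comparison between methods fair.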
Pages: 586-590
Page count: 5