PuVAE: A Variational Autoencoder to Purify Adversarial Examples

Cited by: 45
Authors
Hwang, Uiwon [1 ]
Park, Jaewoo [2 ]
Jang, Hyemi [1 ]
Yoon, Sungroh [1 ]
Cho, Nam Ik [2 ]
Affiliations
[1] Seoul Natl Univ, Elect & Comp Engn, Seoul 08826, South Korea
[2] Seoul Natl Univ, INMC, Dept Elect & Comp Engn, Seoul 08826, South Korea
Funding
National Research Foundation of Singapore;
Keywords
Adversarial attack; variational autoencoder; deep learning;
DOI
10.1109/ACCESS.2019.2939352
CLC Number
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Deep neural networks are widely used and exhibit excellent performance in many areas. However, they are vulnerable to adversarial attacks, which compromise networks at inference time by applying elaborately designed perturbations to input data. Although several defense methods have been proposed to address specific attacks, other types of attacks can circumvent these defenses. Therefore, we propose the Purifying Variational AutoEncoder (PuVAE), a method to purify adversarial examples. The proposed method eliminates an adversarial perturbation by projecting the adversarial example onto the manifold of each class and selecting the closest projection as the purified sample. We experimentally demonstrate the robustness of PuVAE against various attack methods without any prior knowledge of the attacks. In our experiments, the proposed method exhibits performance competitive with state-of-the-art defense methods, and its inference is approximately 130 times faster than that of Defense-GAN, a state-of-the-art purifier method.
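The purification step described in the abstract — project the input onto each class-conditional manifold, then keep whichever projection is closest — can be sketched as follows. This is a minimal illustration, not the authors' implementation: `reconstruct_fn` is a hypothetical stand-in for a trained conditional VAE's encode-decode pass, and the toy prototypes below merely mimic per-class reconstructions.

```python
import numpy as np

def purify(x, reconstruct_fn, num_classes):
    """PuVAE-style purification sketch.

    reconstruct_fn(x, y) is assumed to approximate the projection of
    input x onto the manifold of class y (in the paper, a conditional
    VAE's reconstruction). The candidate closest to x, here under a
    root-mean-square distance, is returned as the purified sample.
    """
    candidates = [reconstruct_fn(x, y) for y in range(num_classes)]
    dists = [np.sqrt(np.mean((x - c) ** 2)) for c in candidates]
    return candidates[int(np.argmin(dists))]

# Toy stand-in: each "class manifold" collapses to a single prototype,
# so projection just returns the prototype (illustrative only).
prototypes = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
toy_reconstruct = lambda x, y: prototypes[y]

adversarial_x = np.array([0.9, 1.2])  # perturbed input near class 1
purified = purify(adversarial_x, toy_reconstruct, num_classes=3)
# purified is prototypes[1], i.e. [1.0, 1.0]
```

In a real setting the purified sample, rather than the raw adversarial input, is then passed to the target classifier.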
Pages: 126582-126593
Page count: 12