Principal Component Adversarial Example

Cited by: 39
Authors
Zhang, Yonggang [1 ]
Tian, Xinmei [1 ]
Li, Ya [1 ]
Wang, Xinchao [2 ]
Tao, Dacheng [3 ,4 ]
Affiliations
[1] Univ Sci & Technol China, Dept Elect Engn & Informat Sci, Hefei 230027, Peoples R China
[2] Stevens Inst Technol, Dept Comp Sci, Hoboken, NJ 07030 USA
[3] Univ Sydney, UBTECH Sydney Artificial Intelligence Ctr, Sydney, NSW 2008, Australia
[4] Univ Sydney, Sch Comp Sci, Fac Engn, Sydney, NSW 2008, Australia
Keywords
Manifolds; Neural networks; Perturbation methods; Distortion; Task analysis; Robustness; Principal component analysis; Deep learning; adversarial examples; classification; manifold learning; NEURAL-NETWORKS; DEEP; REPRESENTATION; ROBUSTNESS;
DOI
10.1109/TIP.2020.2975918
CLC Number
TP18 [Artificial Intelligence Theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Despite having achieved excellent performance on various tasks, deep neural networks have been shown to be susceptible to adversarial examples, i.e., visual inputs crafted with structured, imperceptible noise. To explain this phenomenon, previous works point to the weak capability of the classification models and the difficulty of the classification tasks. These explanations account for some empirical observations but offer little insight into the intrinsic nature of adversarial examples, such as how they are generated and why they transfer. Furthermore, previous works generate adversarial examples in a way that relies entirely on a specific classifier (model). Consequently, the attack ability of the resulting adversarial examples depends strongly on that classifier; more importantly, adversarial examples cannot be generated at all without a trained classifier. In this paper, we ask: what is the real cause of adversarial examples? To answer this question, we propose a new concept, the adversarial region, which explains the existence of adversarial examples as perturbations perpendicular to the tangent plane of the data manifold. This view yields a clear explanation of the transferability of adversarial examples across different models. Moreover, building on the notion of the adversarial region, we propose a novel target-free method that generates adversarial examples via principal component analysis. We verify the adversarial-region hypothesis on a synthetic dataset and demonstrate, through extensive experiments on real datasets, that the adversarial examples generated by our method have transferability competitive with, or even stronger than, that of model-dependent generation methods. Our experiments also show that the proposed method is more robust to defense methods than previous methods.
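The abstract gives no pseudocode, but its core idea, perturbing a sample along directions roughly perpendicular to the data manifold, found via PCA and without any trained classifier, can be sketched as follows. This is a hypothetical illustration rather than the authors' published algorithm; the function name pca_perturb, the epsilon and n_tangent parameters, and the random mixing of low-variance components are assumptions made for the example.

# Hypothetical sketch of a classifier-free, PCA-based perturbation inspired by
# the "adversarial region" idea in the abstract: push a sample along directions
# of low data variance, i.e. roughly perpendicular to the tangent plane of the
# data manifold. Not the authors' exact method.
import numpy as np
from sklearn.decomposition import PCA

def pca_perturb(X, x, epsilon=0.05, n_tangent=50, seed=None):
    """Perturb sample `x` off the manifold spanned by data `X`.

    X         : (n_samples, n_features) training data (e.g. flattened images in [0, 1]).
    x         : (n_features,) sample to perturb.
    epsilon   : L2 budget for the added noise.
    n_tangent : number of leading components treated as the tangent plane
                (must be smaller than the number of fitted components).
    """
    rng = np.random.default_rng(seed)
    pca = PCA().fit(X)                          # principal directions of the data
    normal_dirs = pca.components_[n_tangent:]   # low-variance, off-manifold directions
    coeffs = rng.standard_normal(len(normal_dirs))
    delta = coeffs @ normal_dirs                # random off-manifold direction
    delta *= epsilon / (np.linalg.norm(delta) + 1e-12)   # scale to the budget
    return np.clip(x + delta, 0.0, 1.0)         # keep a valid pixel range

In this reading, the leading principal components approximate the manifold's tangent plane, while the trailing, low-variance components approximate its normal directions, which is where the abstract locates the adversarial region.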
Pages: 4804-4815
Page count: 12