Generating Semantic Adversarial Examples via Feature Manipulation in Latent Space

Citations: 5
Authors
Wang, Shuo [1]
Chen, Shangyu [2]
Chen, Tianle [3]
Nepal, Surya [1]
Rudolph, Carsten [2]
Grobler, Marthie [1]
Affiliations
[1] CSIRO, Data61 & Cybersecur CRC, Marsfield, NSW 2122, Australia
[2] Monash Univ, Fac Informat Technol, Melbourne, Vic 3800, Australia
[3] Univ Queensland, St Lucia, Qld 4072, Australia
Keywords
Adversarial examples; feature manipulation; latent representation; neural networks; variational autoencoder (VAE)
DOI
10.1109/TNNLS.2023.3299408
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The susceptibility of deep neural networks (DNNs) to adversarial intrusions, exemplified by adversarial examples, is well-documented. Conventional attacks implement unstructured, pixel-wise perturbations to mislead classifiers, which often results in a noticeable departure from natural samples and lacks human-perceptible interpretability. In this work, we present an adversarial attack strategy that implements fine-granularity, semantic-meaning-oriented structural perturbations. Our proposed methodology manipulates the semantic attributes of images through the use of disentangled latent codes. We engineer adversarial perturbations by manipulating either a single latent code or a combination thereof. To this end, we propose two unsupervised semantic manipulation strategies: one based on vector-disentangled representation and the other on feature map-disentangled representation, taking into consideration the complexity of the latent codes and the smoothness of the reconstructed images. Our empirical evaluations, conducted extensively on real-world image data, showcase the potency of our attacks, particularly against black-box classifiers. Furthermore, we establish the existence of a universal semantic adversarial example that is agnostic to specific images.
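The idea of perturbing a single latent code until a classifier changes its decision can be illustrated with a toy sketch. This is not the authors' implementation: the linear `decode` and `classify` functions are hypothetical stand-ins for a trained VAE decoder and a target DNN, and the weights, step size, and search budget are assumptions chosen only for illustration.

```python
import numpy as np

def decode(z, W):
    """Toy stand-in for a VAE decoder: a fixed linear map from latent code to 'image'."""
    return W @ z

def classify(x, w):
    """Toy stand-in for the target classifier: a linear binary decision."""
    return int(w @ x > 0)

def single_code_attack(z, W, w, dim, step=0.1, max_steps=100):
    """Shift one latent coordinate until the predicted label flips.

    Returns (z_adv, x_adv), or None if no flip is found within the budget.
    """
    original = classify(decode(z, W), w)
    for k in range(1, max_steps + 1):
        for sign in (1.0, -1.0):
            z_adv = z.copy()
            z_adv[dim] += sign * k * step  # only the chosen code is modified
            x_adv = decode(z_adv, W)
            if classify(x_adv, w) != original:
                return z_adv, x_adv
    return None

# Hypothetical decoder weights (4 "pixels", 2 latent codes) and classifier weights.
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [1.0, -1.0]])
w = np.array([1.0, -1.0, 0.5, 0.5])
z = np.array([0.5, 0.25])  # latent code of a "clean" input

result = single_code_attack(z, W, w, dim=1)
```

In a realistic setting the decoder would be a trained disentangled VAE, so that each latent coordinate corresponds to a semantic attribute, and the search would typically also constrain how far the reconstruction drifts from the original image.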
Pages: 17070-17084
Page count: 15