Generating Semantic Adversarial Examples via Feature Manipulation in Latent Space

Cited by: 4
Authors
Wang, Shuo [1]
Chen, Shangyu [2]
Chen, Tianle [3]
Nepal, Surya [1]
Rudolph, Carsten [2]
Grobler, Marthie [1]
Affiliations
[1] CSIRO, Data61 & Cybersecur CRC, Marsfield, NSW 2122, Australia
[2] Monash Univ, Fac Informat Technol, Melbourne, Vic 3800, Australia
[3] Univ Queensland, St Lucia, Qld 4072, Australia
Keywords
Adversarial examples; feature manipulation; latent representation; neural networks; variational autoencoder (VAE)
DOI
10.1109/TNNLS.2023.3299408
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The susceptibility of deep neural networks (DNNs) to adversarial intrusions, exemplified by adversarial examples, is well-documented. Conventional attacks implement unstructured, pixel-wise perturbations to mislead classifiers, which often results in a noticeable departure from natural samples and lacks human-perceptible interpretability. In this work, we present an adversarial attack strategy that implements fine-granularity, semantic-meaning-oriented structural perturbations. Our proposed methodology manipulates the semantic attributes of images through the use of disentangled latent codes. We engineer adversarial perturbations by manipulating either a single latent code or a combination thereof. To this end, we propose two unsupervised semantic manipulation strategies: one based on vector-disentangled representation and the other on feature map-disentangled representation, taking into consideration the complexity of the latent codes and the smoothness of the reconstructed images. Our empirical evaluations, conducted extensively on real-world image data, showcase the potency of our attacks, particularly against black-box classifiers. Furthermore, we establish the existence of a universal semantic adversarial example that is agnostic to specific images.
Pages: 17070-17084
Number of pages: 15