Attention-guided transformation-invariant attack for black-box adversarial examples

Cited: 5
Authors
Zhu, Jiaqi [1 ]
Dai, Feng [2 ]
Yu, Lingyun [1 ,3 ]
Xie, Hongtao [1 ]
Wang, Lidong [4 ]
Wu, Bo [5 ]
Zhang, Yongdong [1 ]
Affiliations
[1] Univ Sci & Technol China, Sch Informat Sci & Technol, 443 Huangshan Rd, Hefei 230027, Peoples R China
[2] Chinese Acad Sci, Key Lab Intelligent Informat Proc, Beijing, Peoples R China
[3] Hefei Comprehens Natl Sci Ctr, Inst Artificial Intelligence, Hefei, Peoples R China
[4] Beijing Radio & TV Stn, Beijing, Peoples R China
[5] MIT IBM Watson AI Lab, Cambridge, MA USA
Funding
National Natural Science Foundation of China;
Keywords
adversarial examples; attention; media convergence; security; transformation-invariant;
DOI
10.1002/int.22808
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
With the development of media convergence, information acquisition is no longer limited to traditional media, such as newspapers and television, but increasingly comes from digital media on the Internet, where media content should be supervised by platforms. At present, media content analysis on Internet platforms relies on deep neural networks (DNNs). However, DNNs are vulnerable to adversarial examples, which creates security risks. Therefore, it is necessary to study the internal mechanism of adversarial examples in order to build more effective supervision models. In practical applications, supervision models mostly face black-box attacks, so the cross-model transferability of adversarial examples has attracted increasing attention. In this paper, to improve the transferability of adversarial examples, we propose an attention-guided transformation-invariant adversarial attack method, which incorporates an attention mechanism to disrupt the most distinctive features while ensuring that the adversarial attack remains invariant under different transformations. Specifically, we dynamically weight the latent features according to an attention mechanism and disrupt them accordingly. Meanwhile, considering the lack of semantics in low-level features, high-level semantics are introduced as spatial guidance so that low-level feature perturbations concentrate on the most discriminative regions. Moreover, since attention heatmaps may vary significantly across models, a transformation-invariant aggregated attack strategy is proposed to alleviate overfitting to the proxy model's attention. Comprehensive experimental results show that the proposed method significantly improves the transferability of adversarial examples.
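A minimal PyTorch sketch of the idea described in the abstract follows: it disrupts attention-weighted intermediate features of a white-box surrogate model and averages the attack gradient over several random input transformations so the perturbation does not overfit the surrogate's attention map. The layer choice, the channel-mean attention proxy, the resize-and-pad transformation, and names such as attack and num_transforms are illustrative assumptions, not the authors' implementation.

import torch
import torch.nn.functional as F
import torchvision.models as models

device = "cuda" if torch.cuda.is_available() else "cpu"
# Surrogate (white-box proxy) model whose intermediate features are attacked.
surrogate = models.resnet50(weights=models.ResNet50_Weights.DEFAULT).to(device).eval()

features = {}
def save_features(_, __, output):
    features["mid"] = output
# Hook a mid-level block; which layer to disrupt is an assumed choice.
surrogate.layer3.register_forward_hook(save_features)

def resize_and_pad(x, size, left, top):
    # One cheap, differentiable input transformation (resize then zero-pad back).
    resized = F.interpolate(x, size=size, mode="bilinear", align_corners=False)
    pad = x.shape[-1] - size
    return F.pad(resized, (left, pad - left, top, pad - top))

def attack(x, eps=16 / 255, steps=10, num_transforms=4):
    # Iterative attack: maximize attention-weighted feature distortion, averaged
    # over several random transformations, with a signed-gradient (FGSM-style) step.
    x_adv = x.clone().detach()
    alpha = eps / steps
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = 0.0
        for _ in range(num_transforms):
            # Sample one transformation and apply it to clean and adversarial inputs.
            size = int(x.shape[-1] * torch.empty(1).uniform_(0.85, 1.0).item())
            pad = x.shape[-1] - size
            left = torch.randint(0, pad + 1, (1,)).item()
            top = torch.randint(0, pad + 1, (1,)).item()
            with torch.no_grad():
                surrogate(resize_and_pad(x, size, left, top))
                feat_clean = features["mid"].clone()
                # Crude spatial-attention proxy: normalized channel-mean magnitude.
                attn = feat_clean.abs().mean(dim=1, keepdim=True)
                attn = attn / (attn.amax(dim=(2, 3), keepdim=True) + 1e-8)
            surrogate(resize_and_pad(x_adv, size, left, top))
            feat_adv = features["mid"]
            # Attention-weighted feature disruption: push salient features away
            # from their clean values.
            loss = loss + (attn * (feat_adv - feat_clean)).pow(2).mean()
        grad = torch.autograd.grad(loss / num_transforms, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()
            x_adv = x + (x_adv - x).clamp(-eps, eps)   # L-infinity projection
            x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

# Usage (images assumed scaled to [0, 1]):
# x = torch.rand(1, 3, 224, 224).to(device)
# x_adv = attack(x)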
Pages: 3142-3165
Number of pages: 24