Defending Against Universal Attacks Through Selective Feature Regeneration

Cited by: 26
Authors
Borkar, Tejas [1]
Heide, Felix [2, 3]
Karam, Lina [1, 4]
Affiliations
[1] Arizona State Univ, Tempe, AZ 85287 USA
[2] Princeton Univ, Princeton, NJ 08544 USA
[3] Algolux, Montreal, PQ, Canada
[4] Lebanese Amer Univ, Beirut, Lebanon
Source
2020 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2020
Keywords
ROBUSTNESS;
DOI
10.1109/CVPR42600.2020.00079
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
Deep neural network (DNN) predictions have been shown to be vulnerable to carefully crafted adversarial perturbations. Specifically, image-agnostic (universal adversarial) perturbations added to any image can fool a target network into making erroneous predictions. Departing from existing defense strategies that work mostly in the image domain, we present a novel defense which operates in the DNN feature domain and effectively defends against such universal perturbations. Our approach identifies pre-trained convolutional features that are most vulnerable to adversarial noise and deploys trainable feature regeneration units which transform these DNN filter activations into resilient features that are robust to universal perturbations. Regenerating only the top 50% adversarially susceptible activations in at most 6 DNN layers and leaving all remaining DNN activations unchanged, we outperform existing defense strategies across different network architectures by more than 10% in restored accuracy. We show that without any additional modification, our defense trained on ImageNet with one type of universal attack examples effectively defends against other types of unseen universal attacks.
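The selective step described in the abstract (regenerate only the most susceptible half of a layer's activations and pass the rest through unchanged) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The susceptibility score used here (L1 norm of each filter's weights), the function names, and the clipping function that stands in for a trainable feature regeneration unit are all assumptions made for illustration.

```python
import numpy as np


def rank_susceptible_channels(filters, ratio=0.5):
    """Return indices of the `ratio` fraction of filters ranked most susceptible.

    Assumption: the L1 norm of a filter's weights is used as a stand-in
    susceptibility score; the record above does not state the ranking criterion.
    """
    scores = np.abs(filters).reshape(filters.shape[0], -1).sum(axis=1)
    k = int(round(ratio * filters.shape[0]))
    # Sort in descending order of score and keep the top k channel indices.
    return np.argsort(-scores, kind="stable")[:k]


def selective_regeneration(activations, selected, regenerate):
    """Apply `regenerate` to the selected channels only; copy the rest as-is."""
    out = activations.copy()
    out[selected] = regenerate(activations[selected])
    return out


def toy_regenerate(x):
    """Hypothetical placeholder for a trained regeneration unit: clip to [-1, 1]."""
    return np.clip(x, -1.0, 1.0)


rng = np.random.default_rng(0)
filters = rng.normal(size=(8, 3, 3, 3))          # 8 filters of shape 3x3x3
activations = rng.normal(scale=3.0, size=(8, 4, 4))  # one 4x4 map per filter

selected = rank_susceptible_channels(filters, ratio=0.5)
defended = selective_regeneration(activations, selected, toy_regenerate)
```

In a real defense the placeholder `toy_regenerate` would be replaced by a learned transformation, and the same selection would be repeated in each defended layer.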
Pages: 706-716
Number of pages: 11