On the Robustness of Bayesian Neural Networks to Adversarial Attacks

Cited by: 1
Authors
Bortolussi, Luca [1 ]
Carbone, Ginevra [2 ]
Laurenti, Luca [3 ]
Patane, Andrea [4 ]
Sanguinetti, Guido [5 ,6 ]
Wicker, Matthew [7 ]
Affiliations
[1] Univ Trieste, Dept Math Informat & Geosci, Trieste, Italy
[2] Univ Trieste, Dept Math & Geosci, I-34128 Trieste, Italy
[3] TU Delft Univ, Delft Ctr Syst & Control, NL-2628 CN Delft, Netherlands
[4] Trinity Coll Dublin, Sch Comp Sci & Stat, Dublin D02 PN40, Ireland
[5] Scuola Int Super Studi Avanzati, SISSA, I-34136 Trieste, Italy
[6] Univ Edinburgh, Sch Informat, Edinburgh EH8 9AB, Scotland
[7] Univ Oxford, Dept Comp Sci, Oxford OX1 3QG, England
Keywords
Training; adversarial attacks; adversarial robustness; Bayesian inference; Bayesian neural networks (BNNs); Gaussian processes
DOI
10.1109/TNNLS.2024.3386642
CLC Number
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, training deep learning models robust to adversarial attacks is still an open problem. In this article, we analyse the geometry of adversarial attacks in the over-parameterized limit for Bayesian neural networks (BNNs). We show that, in this limit, vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution, i.e., when the data lie on a lower-dimensional submanifold of the ambient space. As a direct consequence, we demonstrate that, in this limit, BNN posteriors are robust to gradient-based adversarial attacks. Crucially, by relying on the convergence of infinitely wide BNNs to Gaussian processes (GPs), we prove that, under certain relatively mild assumptions, the expected gradient of the loss with respect to the BNN posterior distribution vanishes, even when each neural network sampled from the BNN posterior does not have vanishing gradients. Experimental results on MNIST, Fashion MNIST, and a synthetic dataset, with BNNs trained via Hamiltonian Monte Carlo and variational inference, support this line of argument, empirically showing that BNNs can display both high accuracy on clean data and robustness to both gradient-based and gradient-free adversarial attacks.
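As an illustrative sketch only (none of the names below come from the paper; posterior_avg_input_grad, fgsm_on_bnn, build_net, and sample_weights are hypothetical), the abstract's central claim can be read operationally: a gradient-based attack on a BNN effectively follows the input gradient of the loss averaged over posterior weight samples, and it is this posterior-averaged gradient that the paper shows vanishes in the over-parameterized limit. A minimal PyTorch-style Monte Carlo estimate of that quantity, assuming weight samples obtained from HMC or variational inference, might look like:

    # Hedged sketch, not the paper's code: Monte Carlo estimate of the
    # posterior-averaged input gradient that a gradient-based attack would
    # use against a BNN. All function and variable names are illustrative.
    import torch
    import torch.nn as nn

    def posterior_avg_input_grad(x, y, sample_weights, build_net):
        """Average d(loss)/d(x) over weight samples from the (approximate) posterior.

        x, y:           input batch and labels
        sample_weights: iterable of weight state dicts (e.g. from HMC or VI)
        build_net:      callable returning a fresh nn.Module of matching architecture
        """
        loss_fn = nn.CrossEntropyLoss()
        grads = []
        for w in sample_weights:
            net = build_net()
            net.load_state_dict(w)
            x_req = x.clone().detach().requires_grad_(True)
            loss_fn(net(x_req), y).backward()
            grads.append(x_req.grad.detach())
        # The paper's result: in the over-parameterized (infinite-width) limit this
        # average vanishes, even though individual entries of `grads` need not.
        return torch.stack(grads).mean(dim=0)

    def fgsm_on_bnn(x, y, sample_weights, build_net, eps=0.1):
        # FGSM step driven by the posterior-averaged gradient; a vanishing
        # average gradient leaves the attack with no useful direction to follow.
        g = posterior_avg_input_grad(x, y, sample_weights, build_net)
        return (x + eps * g.sign()).clamp(0.0, 1.0)

The clamp to [0, 1] assumes image-valued inputs such as MNIST or Fashion MNIST; for other data ranges it would be dropped or adjusted.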
Pages: 1-14
Number of pages: 14