On the Robustness of Bayesian Neural Networks to Adversarial Attacks

Cited by: 1
Authors
Bortolussi, Luca [1]
Carbone, Ginevra [2]
Laurenti, Luca [3]
Patane, Andrea [4]
Sanguinetti, Guido [5, 6]
Wicker, Matthew [7]
Affiliations
[1] Univ Trieste, Dept Math Informat & Geosci, Trieste, Italy
[2] Univ Trieste, Dept Math & Geosci, I-34128 Trieste, Italy
[3] TU Delft Univ, Delft Ctr Syst & Control, NL-2628 CN Delft, Netherlands
[4] Trinity Coll Dublin, Sch Comp Sci & Stat, Dublin D02 PN40, Ireland
[5] Scuola Int Super Studi Avanzati, SISSA, I-34136 Trieste, Italy
[6] Univ Edinburgh, Sch Informat, Edinburgh EH8 9AB, Scotland
[7] Univ Oxford, Dept Comp Sci, Oxford OX1 3QG, England
Keywords
Training; Adversarial attacks; Adversarial robustness; Bayesian inference; Bayesian neural networks (BNNs); Gaussian process
DOI
10.1109/TNNLS.2024.3386642
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, training deep learning models that are robust to adversarial attacks is still an open problem. In this article, we analyse the geometry of adversarial attacks in the over-parameterized limit for Bayesian neural networks (BNNs). We show that, in the limit, vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution, i.e., when the data lie on a lower dimensional submanifold of the ambient space. As a direct consequence, we demonstrate that in this limit, BNN posteriors are robust to gradient-based adversarial attacks. Crucially, by relying on the convergence of infinitely-wide BNNs to Gaussian processes (GPs), we prove that, under certain relatively mild assumptions, the expected gradient of the loss with respect to the BNN posterior distribution vanishes, even when each NN sampled from the BNN posterior does not have vanishing gradients. Experimental results on MNIST, Fashion MNIST, and a synthetic dataset, with BNNs trained with Hamiltonian Monte Carlo and variational inference, support this line of argument, empirically showing that BNNs can display both high accuracy on clean data and robustness to both gradient-based and gradient-free adversarial attacks.
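The central claim of the abstract, that the gradient averaged over the posterior can vanish even though no individual sampled network has a vanishing gradient, can be illustrated with a small toy sketch. This is not the authors' construction or proof: the logistic model, the hand-picked "posterior samples", the data point, and the function names (input_gradient, expected_gradient, fgsm) are all assumptions made for illustration. The data point lies on the line x2 = 0, standing in for a lower dimensional data manifold, and the samples disagree only about the direction orthogonal to it.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def input_gradient(w, b, x, y):
    """Gradient of the logistic cross-entropy loss with respect to the input x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    return [(p - y) * wi for wi in w]

def expected_gradient(samples, x, y):
    """Monte Carlo average of the input gradient over posterior samples."""
    grads = [input_gradient(w, b, x, y) for w, b in samples]
    return [sum(g[k] for g in grads) / len(grads) for k in range(len(x))]

def fgsm(x, grad, eps):
    """Fast gradient sign step: move each coordinate by eps along sign(grad)."""
    sign = lambda v: (v > 0) - (v < 0)
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]

# Hypothetical posterior samples (weights, bias). They agree on the first
# coordinate and are symmetric in the second one, which the data never
# constrains because every data point has x2 = 0.
samples = []
for a in (0.5, 1.0, 2.0):
    samples.append(([1.0, a], 0.0))
    samples.append(([1.0, -a], 0.0))

x, y = [0.3, 0.0], 1.0  # toy input on the line x2 = 0, with label 1

per_sample = [input_gradient(w, b, x, y) for w, b in samples]
mean_grad = expected_gradient(samples, x, y)
x_adv = fgsm(x, mean_grad, eps=0.1)

print("per-sample gradients:", per_sample)
print("expected gradient:   ", mean_grad)
print("FGSM point:          ", x_adv)
```

In this sketch every sampled model has a nonzero gradient component orthogonal to the data line, but those components cancel in the average, so a sign-gradient step computed from the expected gradient does not move the input off the line. The component along the data direction does not cancel here; the toy only illustrates the cancellation mechanism in the unconstrained direction, not the full result stated in the abstract.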
Pages: 1-14
Number of pages: 14
Related papers
50 in total
  • [1] Robustness of Bayesian Neural Networks to White-Box Adversarial Attacks
    Uchendu, Adaku
    Campoy, Daniel
    Menart, Christopher
    Hildenbrandt, Alexandra
    2021 IEEE FOURTH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND KNOWLEDGE ENGINEERING (AIKE 2021), 2021, : 72 - 80
  • [2] Adversarial Robustness Certification for Bayesian Neural Networks
    Wicker, Matthew
    Platzer, Andre
    Laurenti, Luca
    Kwiatkowska, Marta
    FORMAL METHODS, PT I, FM 2024, 2025, 14933 : 3 - 28
  • [3] Robustness Against Adversarial Attacks in Neural Networks Using Incremental Dissipativity
    Aquino, Bernardo
    Rahnama, Arash
    Seiler, Peter
    Lin, Lizhen
    Gupta, Vijay
    IEEE CONTROL SYSTEMS LETTERS, 2022, 6 : 2341 - 2346
  • [4] Improving Robustness Against Adversarial Attacks with Deeply Quantized Neural Networks
    Ayaz, Ferheen
    Zakariyya, Idris
    Cano, Jose
    Keoh, Sye Loong
    Singer, Jeremy
    Pau, Danilo
    Kharbouche-Harrari, Mounia
    2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN, 2023
  • [5] Robustness of Sparsely Distributed Representations to Adversarial Attacks in Deep Neural Networks
    Sardar, Nida
    Khan, Sundas
    Hintze, Arend
    Mehra, Priyanka
    ENTROPY, 2023, 25 (06)
  • [6] Robustness and Transferability of Adversarial Attacks on Different Image Classification Neural Networks
    Smagulova, Kamilya
    Bacha, Lina
    Fouda, Mohammed E.
    Kanj, Rouwaida
    Eltawil, Ahmed
    ELECTRONICS, 2024, 13 (03)
  • [7] Improving adversarial robustness of Bayesian neural networks via multi-task adversarial training
    Chen, Xu
    Liu, Chuancai
    Zhao, Yue
    Jia, Zhiyang
    Jin, Ge
    INFORMATION SCIENCES, 2022, 592 : 156 - 173
  • [8] Not So Robust after All: Evaluating the Robustness of Deep Neural Networks to Unseen Adversarial Attacks
    Garaev, Roman
    Rasheed, Bader
    Khan, Adil Mehmood
    ALGORITHMS, 2024, 17 (04)
  • [9] AdvQuNN: A Methodology for Analyzing the Adversarial Robustness of Quanvolutional Neural Networks
    El Maouaki, Walid
    Marchisio, Alberto
    Said, Taoufik
    Bennai, Mohamed
    Shafique, Muhammad
    2024 IEEE INTERNATIONAL CONFERENCE ON QUANTUM SOFTWARE, IEEE QSW 2024, 2024, : 175 - 181
  • [10] Adversarial robustness improvement for deep neural networks
    Eleftheriadis, Charis
    Symeonidis, Andreas
    Katsaros, Panagiotis
    MACHINE VISION AND APPLICATIONS, 2024, 35