On the Robustness of Bayesian Neural Networks to Adversarial Attacks

Cited by: 1
Authors
Bortolussi, Luca [1 ]
Carbone, Ginevra [2 ]
Laurenti, Luca [3 ]
Patane, Andrea [4 ]
Sanguinetti, Guido [5 ,6 ]
Wicker, Matthew [7 ]
Affiliations
[1] Univ Trieste, Dept Math Informat & Geosci, Trieste, Italy
[2] Univ Trieste, Dept Math & Geosci, I-34128 Trieste, Italy
[3] TU Delft Univ, Delft Ctr Syst & Control, NL-2628 CN Delft, Netherlands
[4] Trinity Coll Dublin, Sch Comp Sci & Stat, Dublin D02 PN40, Ireland
[5] Scuola Int Super Studi Avanzati, SISSA, I-34136 Trieste, Italy
[6] Univ Edinburgh, Sch Informat, Edinburgh EH8 9AB, Scotland
[7] Univ Oxford, Dept Comp Sci, Oxford OX1 3QG, England
Keywords
Training; adversarial attacks; adversarial robustness; Bayesian inference; Bayesian neural networks (BNNs); Gaussian processes
DOI
10.1109/TNNLS.2024.3386642
CLC number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Vulnerability to adversarial attacks is one of the principal hurdles to the adoption of deep learning in safety-critical applications. Despite significant efforts, both practical and theoretical, training deep learning models robust to adversarial attacks is still an open problem. In this article, we analyse the geometry of adversarial attacks in the over-parameterized limit for Bayesian neural networks (BNNs). We show that, in this limit, vulnerability to gradient-based attacks arises as a result of degeneracy in the data distribution, i.e., when the data lie on a lower-dimensional submanifold of the ambient space. As a direct consequence, we demonstrate that, in this limit, BNN posteriors are robust to gradient-based adversarial attacks. Crucially, by relying on the convergence of infinitely wide BNNs to Gaussian processes (GPs), we prove that, under certain relatively mild assumptions, the expected gradient of the loss with respect to the BNN posterior distribution vanishes, even when each neural network sampled from the posterior does not have vanishing gradients. Experimental results on MNIST, Fashion-MNIST, and a synthetic dataset, with BNNs trained by Hamiltonian Monte Carlo and variational inference, support this line of argument, empirically showing that BNNs can display both high accuracy on clean data and robustness to both gradient-based and gradient-free adversarial attacks.
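The abstract's central claim concerns the expected input gradient of the loss under the BNN posterior, which is the quantity that gradient-based attacks on a BNN actually use. The following minimal PyTorch sketch (hypothetical, not the paper's code) illustrates how that quantity is assembled: an ensemble of weight draws from a wide one-hidden-layer network stands in for HMC posterior samples, and the attack direction is the Monte Carlo average of the per-sample input gradients. The paper's result is that, in the over-parameterized limit and under its assumptions, this average vanishes even though the individual per-sample gradients do not; the sketch only shows the construction, not the limit.

import torch

torch.manual_seed(0)

# Toy setting: one test input x with scalar target y, attacked through the
# posterior predictive loss of a one-hidden-layer network. The weight draws
# below are a hypothetical stand-in for HMC posterior samples; this is NOT
# the paper's experimental setup.
d_in, d_hidden, n_samples = 10, 512, 50
x = torch.randn(d_in)
y = torch.tensor(1.0)

def f(x, w1, w2):
    # f(x) = w2 . relu(w1 x) / sqrt(width), the usual wide-network scaling
    return torch.relu(w1 @ x) @ w2 / d_hidden ** 0.5

per_sample_grads = []
for _ in range(n_samples):
    w1 = torch.randn(d_hidden, d_in)      # one weight draw (a prior draw here)
    w2 = torch.randn(d_hidden)
    x_adv = x.clone().requires_grad_(True)
    loss = (f(x_adv, w1, w2) - y) ** 2    # squared loss for this weight sample
    loss.backward()
    per_sample_grads.append(x_adv.grad.clone())

per_sample_grads = torch.stack(per_sample_grads)
expected_grad = per_sample_grads.mean(dim=0)   # Monte Carlo estimate of E_w[grad_x L]

# A gradient-based attack on the BNN (e.g. FGSM) would perturb x along
# sign(expected_grad); per-sample gradients are what an attack on a single
# sampled network would use instead.
print("mean per-sample gradient norm:", per_sample_grads.norm(dim=1).mean().item())
print("norm of expected gradient:    ", expected_grad.norm().item())

In this construction the averaging over posterior samples is exactly what distinguishes attacking the BNN's posterior predictive from attacking any single sampled network, which is why the two can behave so differently in the over-parameterized regime discussed in the abstract.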
Pages: 1-14
Page count: 14