Inspecting adversarial examples using the Fisher information

Cited by: 13
Authors
Martin, Joerg [1 ]
Elster, Clemens [2 ]
Affiliations
[1] Phys Tech Bundesanstalt, Data Anal Grp, Abbestr 2, D-10587 Berlin, Germany
[2] Phys Tech Bundesanstalt, Abbestr 2, D-10587 Berlin, Germany
Keywords
Deep Learning; Adversarial Examples; Fisher Information; Explainability
DOI
10.1016/j.neucom.2019.11.052
Chinese Library Classification (CLC)
TP18 [Theory of Artificial Intelligence]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Adversarial examples are constructed by slightly perturbing a correctly processed input to a trained neural network such that the network produces an incorrect result. This work proposes the use of the Fisher information for the detection of such adversarial attacks. We discuss various quantities whose computation scales well with the network size, study their behavior on adversarial examples, and show how they can highlight the importance of single input neurons, thereby providing a visual tool for further analyzing the behavior of a neural network. The potential of our methods is demonstrated by applications to the MNIST, CIFAR10 and Fruits-360 datasets and through comparison to competing methods.
Pages: 80-86
Page count: 7