Adversarial Examples Detection and Analysis with Layer-wise Autoencoders

Cited by: 4

Authors:

Wojcik, Bartosz [1]

Morawiecki, Pawel [2]

Smieja, Marek [1]

Krzyzek, Tomasz [1]

Spurek, Przemyslaw [1]

Tabor, Jacek [1]

Affiliations:

[1] Jagiellonian Univ, Fac Math & Comp Sci, Lojasiewicza 6, PL-30348 Krakow, Poland

[2] Polish Acad Sci, Inst Comp Sci, Jana Kazimierza 5, PL-01248 Warsaw, Poland

Source:

2021 IEEE 33RD INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2021) | 2021

Keywords:

adversarial examples; adversarial attack detection; adversarial noise; robustness; neural networks safety; trustworthy machine learning;

DOI:

10.1109/ICTAI52525.2021.00209

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper presents a mechanism for detecting adversarial examples based on data representations taken from the hidden layers of the target network. For this purpose, individual autoencoders are trained at intermediate layers of the target network. Together, these autoencoders describe the manifold of true data and, in consequence, can be used to classify whether a given example has the same characteristics as true data. The approach also gives insight into the behavior of adversarial examples and their flow through the layers of a deep neural network. Experimental results show that our method outperforms the state of the art in supervised and unsupervised settings.
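
The idea in the abstract, scoring an example by how well per-layer autoencoders reconstruct its hidden representations, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the autoencoders are replaced by rank-k linear autoencoders (equivalent to PCA), the "hidden-layer activations" are synthetic arrays rather than outputs of a real target network, the off-manifold inputs are simulated with added noise rather than produced by an actual attack, and the decision rule (thresholding the summed per-layer errors at the 95th percentile of held-out clean data) is a hypothetical choice.

```python
import numpy as np

def fit_linear_autoencoder(acts, k):
    """Fit a rank-k linear autoencoder (PCA) on activations of one layer."""
    mean = acts.mean(axis=0)
    # The top-k right singular vectors span the best rank-k linear subspace.
    _, _, vt = np.linalg.svd(acts - mean, full_matrices=False)
    return mean, vt[:k]

def reconstruction_error(acts, model):
    """Per-example L2 distance between activations and their reconstruction."""
    mean, basis = model
    centered = acts - mean
    residual = centered - centered @ basis.T @ basis
    return np.linalg.norm(residual, axis=1)

def layer_features(layer_acts, models):
    """Stack per-layer reconstruction errors: one row per example."""
    errors = [reconstruction_error(a, m) for a, m in zip(layer_acts, models)]
    return np.stack(errors, axis=1)

# --- Synthetic stand-ins for hidden-layer activations (assumption) ---
rng = np.random.default_rng(0)
N_LAYERS, DIM, K = 2, 20, 3

def sample_layer(mixing, n, noise):
    """Points near a K-dimensional subspace; `noise` controls the distance from it."""
    latent = rng.normal(size=(n, mixing.shape[0]))
    return latent @ mixing + noise * rng.normal(size=(n, mixing.shape[1]))

mixings = [rng.normal(size=(K, DIM)) for _ in range(N_LAYERS)]
train = [sample_layer(m, 500, 0.01) for m in mixings]    # clean training data
clean = [sample_layer(m, 200, 0.01) for m in mixings]    # held-out clean data
shifted = [sample_layer(m, 200, 0.5) for m in mixings]   # simulated off-manifold data

# One autoencoder per layer, trained on clean activations only.
models = [fit_linear_autoencoder(a, K) for a in train]
clean_feats = layer_features(clean, models)
shifted_feats = layer_features(shifted, models)

# Hypothetical unsupervised rule: flag examples whose summed error is unusually high.
threshold = np.percentile(clean_feats.sum(axis=1), 95)
detection_rate = float(np.mean(shifted_feats.sum(axis=1) > threshold))
print(f"flagged fraction of off-manifold examples: {detection_rate:.2f}")
```

In a supervised variant, the per-layer error vectors returned by `layer_features` could instead be fed to a small classifier trained on labeled clean and adversarial examples; that pairing is likewise an assumption of this sketch.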

Pages: 1322-1326

Page count: 5
