Adversarial Robustness on In- and Out-Distribution Improves Explainability

Cited by: 27
Authors
Augustin, Maximilian [1]
Meinke, Alexander [1]
Hein, Matthias [1]
Affiliations
[1] Univ Tubingen, Tubingen, Germany
Source
COMPUTER VISION - ECCV 2020, PT XXVI | 2020 / Vol. 12371
Keywords
DOI
10.1007/978-3-030-58574-7_14
Chinese Library Classification
TP18 [Theory of Artificial Intelligence];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Neural networks have led to major improvements in image classification, but suffer from being non-robust to adversarial changes, unreliable uncertainty estimates on out-distribution samples, and inscrutable black-box decisions. In this work we propose RATIO, a training procedure for Robustness via Adversarial Training on In- and Out-distribution, which leads to robust models with reliable and robust confidence estimates on the out-distribution. RATIO has similar generative properties to adversarial training, so that visual counterfactuals produce class-specific features. While adversarial training comes at the price of lower clean accuracy, RATIO achieves state-of-the-art ℓ2-adversarial robustness on CIFAR10 while maintaining better clean accuracy.
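The record gives only a high-level description of RATIO. As an illustration of the kind of combined objective the abstract describes — ℓ2 adversarial training on labeled in-distribution points, plus penalizing high confidence on adversarially perturbed out-distribution points — here is a minimal NumPy sketch. The toy logistic model, the PGD step sizes, the shifted-Gaussian "out-distribution", and the λ weighting are all assumptions for illustration, not the paper's actual architecture or hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-class logistic "network": p(y|x) = softmax(W x). This stands in for
# the deep classifier; the model and all constants below are assumptions.
D, N = 4, 32
W = rng.normal(size=(2, D))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def probs(x):
    return softmax(x @ W.T)

def ce_loss(x, y):
    # Mean cross-entropy of the true labels.
    return -np.log(probs(x)[np.arange(len(y)), y] + 1e-12).mean()

def mean_max_conf(x):
    # Mean confidence (max class probability); should be low out-of-distribution.
    return probs(x).max(axis=-1).mean()

def grad_ce(x, y):
    # d/dx of per-sample cross-entropy for the logistic model: (p - onehot(y)) W.
    p = probs(x)
    p[np.arange(len(y)), y] -= 1.0
    return p @ W

def grad_log_max_prob(x):
    # d/dx of log p_k for the currently most likely class k: (onehot(k) - p) W.
    p = probs(x)
    e = np.zeros_like(p)
    e[np.arange(len(x)), p.argmax(axis=-1)] = 1.0
    return (e - p) @ W

def l2_pgd(x, grad_fn, eps=0.5, steps=10, lr=0.2):
    """Normalized gradient ascent on grad_fn's objective, projected onto an l2 ball."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        g = grad_fn(x + delta)
        g_norm = np.linalg.norm(g, axis=1, keepdims=True) + 1e-12
        delta = delta + lr * g / g_norm                           # ascent step
        d_norm = np.linalg.norm(delta, axis=1, keepdims=True)
        delta = delta * np.minimum(1.0, eps / (d_norm + 1e-12))   # project to ball
    return x + delta

# Labeled in-distribution batch and an unlabeled out-distribution batch.
x_in = rng.normal(size=(N, D))
y_in = rng.integers(0, 2, size=N)
x_out = rng.normal(size=(N, D)) + 3.0   # mean-shifted: crude "out-distribution"

# Worst case within the l2 ball: maximize CE on in-distribution samples,
# maximize confidence on out-distribution samples.
x_in_adv = l2_pgd(x_in, lambda x: grad_ce(x, y_in))
x_out_adv = l2_pgd(x_out, grad_log_max_prob)

# RATIO-style combined training objective (lambda weighting is an assumption):
# fit the labels under attack, while keeping worst-case out-dist confidence low.
lam = 1.0
ratio_loss = ce_loss(x_in_adv, y_in) + lam * np.log(mean_max_conf(x_out_adv))
print(f"adv CE (in): {ce_loss(x_in_adv, y_in):.3f}  "
      f"adv conf (out): {mean_max_conf(x_out_adv):.3f}")
```

A real implementation would differentiate through a deep network with an autodiff framework and minimize `ratio_loss` over the weights; the sketch only shows the structure of the inner maximization and the two-term objective.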
Pages: 228-245
Page count: 18