SPLASH: Learnable activation functions for improving accuracy and adversarial robustness

Cited by: 18

Authors:

Tavakoli, Mohammadamin [1]

Agostinelli, Forest [2]

Baldi, Pierre [1]

Affiliations:

[1] Univ Calif Irvine, Dept Comp Sci, Irvine, CA USA

[2] Univ South Carolina, Dept Comp Sci & Engn, Columbia, SC 29208 USA

Source:

NEURAL NETWORKS | 2021 / Vol. 140

Funding:

U.S. National Science Foundation;

Keywords:

Activation; Neural networks; Accuracy; Robustness; Adversarial; NEURAL-NETWORKS;

DOI:

10.1016/j.neunet.2021.02.023

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We introduce SPLASH units, a class of learnable activation functions shown to improve both the accuracy of deep neural networks and their robustness to adversarial attacks. SPLASH units combine a simple parameterization with the ability to approximate a wide range of non-linear functions. SPLASH units are: (1) continuous; (2) grounded (f(0) = 0); (3) built from symmetric hinges; and (4) hinged at fixed locations that are derived from the data (i.e., no learning required). Compared to nine other learned and fixed activation functions, including ReLU and its variants, SPLASH units show superior performance across three datasets (MNIST, CIFAR-10, and CIFAR-100) and four architectures (LeNet5, All-CNN, ResNet-20, and Network-in-Network). Furthermore, we show that SPLASH units significantly increase the robustness of deep neural networks to adversarial attacks. Our experiments on both black-box and white-box adversarial attacks show that commonly used architectures, namely LeNet5, All-CNN, Network-in-Network, and ResNet-20, can be up to 31% more robust to adversarial attacks by simply using SPLASH units instead of ReLUs. Finally, we show the benefits of using SPLASH activation functions in bigger architectures designed for non-trivial datasets such as ImageNet. © 2021 Elsevier Ltd. All rights reserved.
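The abstract lists the four properties of a SPLASH unit but does not give its functional form. As a rough illustration only, the sketch below assumes a sum of hinge (ReLU-like) terms placed at symmetric locations ±b on both sides of zero, with one slope per hinge. The function name `splash`, the parameter names `hinges`, `a_pos`, and `a_neg`, and the example values are hypothetical and are not taken from this record; in a trained network the slopes would be the learned parameters, while the hinge locations would stay fixed.

```python
from typing import Sequence


def splash(x: float,
           hinges: Sequence[float],
           a_pos: Sequence[float],
           a_neg: Sequence[float]) -> float:
    """Piecewise-linear activation with symmetric hinges at +b and -b.

    Assumed form (illustrative, not quoted from the source):
        f(x) = sum_s a_pos[s] * max(0,  x - b_s)
             + sum_s a_neg[s] * max(0, -x - b_s)
    """
    if not (len(hinges) == len(a_pos) == len(a_neg)):
        raise ValueError("hinges, a_pos and a_neg must have equal length")
    if any(b < 0 for b in hinges):
        raise ValueError("hinge locations must be non-negative")

    # Hinges on the positive side of the input axis.
    right = sum(a * max(0.0, x - b) for a, b in zip(a_pos, hinges))
    # Mirror-image hinges on the negative side.
    left = sum(a * max(0.0, -x - b) for a, b in zip(a_neg, hinges))
    return right + left


# A single hinge at 0 with slopes (1, 0) reduces to an ordinary ReLU.
relu_like = splash(2.0, hinges=[0.0], a_pos=[1.0], a_neg=[0.0])
```

Because every hinge location is non-negative, each term vanishes at x = 0, so the function is grounded (f(0) = 0), and a sum of continuous hinge terms is itself continuous. Keeping the hinge locations fixed leaves only the slopes to learn, which is one way to read the "simple parameterization" mentioned in the abstract.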


Pages: 1-12

Page count: 12
