Improving the Antinoise Ability of DNNs via a Bio-Inspired Noise Adaptive Activation Function Rand Softplus

Cited by: 18
Authors
Chen, Yunhua [1 ]
Mai, Yingchao [1 ]
Xiao, Jinsheng [2 ]
Zhang, Ling [1 ]
Affiliations
[1] Guangdong Univ Technol, Sch Comp, Guangzhou 510006, Guangdong, Peoples R China
[2] Wuhan Univ, Sch Elect Informat, Wuhan 430072, Hubei, Peoples R China
Funding
National Natural Science Foundation of China;
DOI
10.1162/neco_a_01192
CLC Classification Number
TP18 [Artificial Intelligence Theory];
Subject Classification Numbers
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Although deep neural networks (DNNs) have led to many remarkable results in cognitive tasks, they are still far from matching human-level cognition in antinoise capability. Recent research shows how brittle and susceptible current models are to small variations in data distribution. In this letter, we study the stochasticity-resistance characteristics of biological neurons by simulating the input-output response process of a leaky integrate-and-fire (LIF) neuron model and propose a novel activation function, rand softplus (RSP), to model this response process. In RSP, a scale factor eta is employed to mimic the stochasticity adaptability of biological neurons, so that the activation function improves the antinoise capability of a DNN. We validated the performance of RSP with a 19-layer residual network (ResNet) and a 19-layer visual geometry group (VGG) network on facial expression recognition data sets and compared it with other popular activation functions, such as rectified linear units (ReLU), softplus, leaky ReLU (LReLU), exponential linear unit (ELU), and noisy softplus (NSP). The experimental results show that when RSP is applied to VGG-19 or ResNet-19, the average recognition accuracy under five different noise levels exceeds that of the other functions on both facial expression data sets; in other words, RSP outperforms the other activation functions in noise resistance. Moreover, RSP improves the antinoise performance of VGG-19 to a greater extent than that of ResNet-19. In addition, RSP is easier to train than NSP because it has only one parameter, which is calculated automatically from the input data. Therefore, this work provides the deep learning community with a novel activation function that can better deal with overfitting problems.
Pages: 1215 - 1233
Number of pages: 19