Enhancing batch normalized convolutional networks using displaced rectifier linear units: A systematic comparative study

Cited by: 26
Authors
Macedo, David [1]
Zanchettin, Cleber [1]
Oliveira, Adriano L. I. [1]
Ludermir, Teresa [1]
Affiliations
[1] Univ Fed Pernambuco, Ctr Informat, Av Jornalista Anibal Fernandes S-N, BR-50670901 Recife, PE, Brazil
Keywords
DReLU; Activation function; Batch normalization; Comparative study; Convolutional Neural Networks; Deep learning; NEURAL-NETWORK; SENTIMENT ANALYSIS; DEEP
DOI
10.1016/j.eswa.2019.01.066
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
A substantial number of expert and intelligent systems rely on deep learning methods to solve problems in areas such as economics, physics, and medicine. Improving the accuracy of the activation functions used by such methods can directly and positively impact the overall performance and quality of these systems at no additional cost. In this sense, enhancing the design of such fundamental theoretical blocks is of great significance, as it immediately benefits a broad range of current and future real-world deep learning applications. Therefore, in this paper, we turn our attention to the interplay between activation functions and batch normalization, which is currently a practically mandatory technique for training deep networks. We propose the Displaced Rectifier Linear Unit (DReLU) activation function, conjecturing that extending the identity function of ReLU into the third quadrant enhances compatibility with batch normalization. Moreover, we used statistical tests to compare the impact of distinct activation functions (ReLU, LReLU, PReLU, ELU, and DReLU) on the learning speed and test accuracy of standardized state-of-the-art VGG and Residual Network models. These convolutional neural networks were trained on CIFAR-100 and CIFAR-10, the most commonly used deep learning computer vision datasets. The results showed that DReLU sped up learning in all models and datasets. Moreover, statistically significant performance assessments (p < 0.05) showed that DReLU improved on the test accuracy of ReLU in all scenarios. Furthermore, DReLU achieved better test accuracy than any other tested activation function in all experiments except one, in which it presented the second-best performance. Therefore, this work demonstrates that it is possible to increase performance by replacing ReLU with an enhanced activation function. (C) 2019 Elsevier Ltd. All rights reserved.
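For concreteness, a minimal sketch of the proposed activation follows. DReLU keeps the identity for inputs above a small negative displacement and saturates below it, i.e. f(x) = max(x, -delta); the delta value used here is an illustrative assumption, not necessarily the value tuned in the paper:

import numpy as np

def drelu(x, delta=0.05):
    """Displaced Rectifier Linear Unit (sketch).

    Identity for x > -delta and constant -delta below it, i.e. a ReLU
    whose elbow is displaced into the third quadrant. delta=0.05 is an
    illustrative assumption; the paper selects the displacement
    experimentally.
    """
    return np.maximum(x, -delta)

def relu(x):
    """Standard ReLU for comparison: identity only for x >= 0."""
    return np.maximum(x, 0.0)

x = np.linspace(-0.2, 0.2, 9)
print(drelu(x))  # identity down to -delta, then clipped at -delta
print(relu(x))   # all negative inputs mapped to 0

Since batch normalization produces pre-activations centered near zero, displacing the elbow below zero lets more normalized inputs fall in the identity region and keep non-zero gradients, which is one reading of the compatibility conjecture stated in the abstract.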
Pages: 271-281
Page count: 11