An Empirical Study on Position of the Batch Normalization Layer in Convolutional Neural Networks

Cited by: 44
Authors
Hasani, Moein [1]
Khotanlou, Hassan [1]
Affiliations
[1] Bu Ali Sina Univ, Dept Comp Engn, Hamadan, Hamadan, Iran
Source
2019 5th Iranian Conference on Signal Processing and Intelligent Systems (ICSPIS 2019) | 2019
Keywords
convolutional neural networks; batch normalization
DOI
10.1109/icspis48872.2019.9066113
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we study how changing the position of the batch normalization (BN) layer affects the training of convolutional neural networks (CNNs). We run our experiments on three networks: AlexNet, VGG-16, and ResNet-20. We show that the speed-up provided by BN can be further improved by placing the BN layer in positions other than the one suggested in the original BN paper. We also discuss how a BN layer in a given position can aid the training of one network but not another. We study three positions for the BN layer: between the convolutional layer and the non-linear activation function, after the non-linear activation function, and before each convolutional layer.
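The three BN placements named in the abstract can be sketched as layer orderings within a single convolutional block. This is only an illustrative sketch, not the authors' code: the function name `block_layers`, the position labels, and the assumption that an activation follows the convolution in the third variant are all hypothetical.

```python
# Illustrative sketch only: names and labels below are hypothetical,
# not taken from the paper.
POSITIONS = (
    "between_conv_and_activation",
    "after_activation",
    "before_conv",
)


def block_layers(position):
    """Return the layer order of one convolutional block for a BN position."""
    if position == "between_conv_and_activation":
        # BN sits between the convolution and the non-linearity.
        return ["conv", "bn", "activation"]
    if position == "after_activation":
        # BN normalizes the output of the non-linearity.
        return ["conv", "activation", "bn"]
    if position == "before_conv":
        # BN normalizes the input of the convolution (assumes the
        # activation still follows the convolution).
        return ["bn", "conv", "activation"]
    raise ValueError(f"unknown BN position: {position!r}")


if __name__ == "__main__":
    for name in POSITIONS:
        print(f"{name}: {' -> '.join(block_layers(name))}")
```

In a real framework each string would map to an actual layer object; the sketch only makes the ordering explicit.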
Pages: 4