Constraints on Hyper-parameters in Deep Learning Convolutional Neural Networks

Cited by: 0
Authors
Al-Saggaf, Ubaid M. [1 ,2 ]
Botalb, Abdelaziz [1 ,2 ]
Faisal, Muhammad [3 ]
Moinuddin, Muhammad [1 ,2 ]
Alsaggaf, Abdulrahman U. [1 ,2 ]
Alfakeh, Sulhi Ali [4 ]
Affiliations
[1] King Abdulaziz Univ, Elect & Comp Engn Dept, Jeddah 21589, Saudi Arabia
[2] King Abdulaziz Univ, CEIES, Jeddah 21589, Saudi Arabia
[3] King Fahd Univ Petr & Minerals, Dammam Community Coll, Comp & Informat Technol Dept, Dhahran 31261, Saudi Arabia
[4] King Abdulaziz Univ, Fac Med, Dept Internal Med, Jeddah 21589, Saudi Arabia
Keywords
Neural networks; convolution; pooling; hyper-parameters; CNN; deep learning; zero-padding; stride; backpropagation; OPTIMIZATION; SEARCH;
DOI
10.14569/IJACSA.2022.0131150
CLC Number
TP301 [Theory, Methods];
Discipline Code
081202 ;
Abstract
Convolutional Neural Networks (CNNs), a type of Deep Learning, have a very large number of hyper-parameters compared with Artificial Neural Networks (ANNs), which makes CNN training more demanding. Hyper-parameter tuning is difficult in CNNs because of the huge optimization space spanned by a large number of hyper-parameters such as the number of layers, number of neurons, number of kernels, stride, padding, row or column truncation, parameters of the backpropagation algorithm, etc. Moreover, most existing techniques in the literature for selecting these parameters are based on ad hoc practice developed for specific datasets. In this work, we empirically investigate and show that CNN performance is linked not only to choosing the right hyper-parameters but also to how they are implemented. More specifically, performance also depends on how the network handles cases where CNN operations are given hyper-parameters that do not symmetrically fit the input volume. We demonstrate two different implementations: cropping or padding the input volume to make it fit. Our analysis shows that padding outperforms cropping in prediction accuracy (85.58% versus 82.62%) while requiring less training time (8 minutes less).
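The mismatch the abstract describes arises when the stride does not evenly divide the input extent minus the kernel size. A minimal sketch of the two implementations compared in the paper (the function name and the concrete sizes are illustrative assumptions, not taken from the paper):

```python
import math

def conv_output_size(n, k, s, mode):
    """Output length of a 1-D convolution/pooling pass with window size k
    and stride s over an input of length n, when (n - k) is not a
    multiple of s. 'crop' truncates the leftover input (floor division),
    while 'pad' zero-pads the input so every element is covered (ceiling)."""
    if mode == "crop":
        return (n - k) // s + 1            # floor: trailing input is dropped
    if mode == "pad":
        return math.ceil((n - k) / s) + 1  # ceil: input is zero-padded to fit
    raise ValueError(f"unknown mode: {mode!r}")

# Example: 28-pixel input, 5-wide kernel, stride 3 -> (28 - 5) / 3 is not an integer
print(conv_output_size(28, 5, 3, "crop"))  # 8
print(conv_output_size(28, 5, 3, "pad"))   # 9
```

This is the same floor-vs-ceiling distinction that deep-learning frameworks expose as "VALID" versus "SAME"-style padding; the paper's experiments measure which choice yields better accuracy and training time.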
Pages: 439-449
Page count: 11
References
25 entries in total
[1]   Either crop or pad the input volume: What is beneficial for Convolutional Neural Network? [J].
Al-Saggaf, Ubaid M. ;
Botalb, Abdelaziz ;
Moinuddin, Muhammad ;
Alfakeh, Sulhi Ali ;
Ali, Syed Saad Azhar ;
Boon, Tang Tong .
2020 8TH INTERNATIONAL CONFERENCE ON INTELLIGENT AND ADVANCED SYSTEMS (ICIAS), 2021,
[2]  
Albelwi S, 2016, 2016 15TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2016), P53, DOI [10.1109/ICMLA.2016.0018, 10.1109/ICMLA.2016.46]
[3]  
[Anonymous], 2014, NIPS WORKSHOP BAYESI
[4]  
[Anonymous], 2007, IEEE INT C ICML
[5]  
Baker B, 2017, arXiv, DOI arXiv:1611.02167
[6]  
Bergstra J., 2011, Adv. Neural Inf. Process. Syst., P2546
[7]  
Bergstra J, 2012, J MACH LEARN RES, V13, P281
[8]  
Bochinski E, 2017, IEEE IMAGE PROC, P3924
[9]  
Cardona-Escobar, 2017, CIARP 2017, P143
[10]   An effective algorithm for hyperparameter optimization of neural networks [J].
Diaz, G. I. ;
Fokoue-Nkoutche, A. ;
Nannicini, G. ;
Samulowitz, H. .
IBM JOURNAL OF RESEARCH AND DEVELOPMENT, 2017, 61 (4-5)