Non-smooth Bayesian learning for artificial neural networks

Cited by: 3
Authors
Fakhfakh M. [1 ,2 ]
Chaari L. [2 ]
Bouaziz B. [1 ]
Gargouri F. [1 ]
Affiliations
[1] MIRACL laboratory, University of Sfax, Sfax
[2] University of Toulouse, INP, IRIT, Toulouse
Keywords
Artificial neural networks; Hamiltonian dynamics; Machine learning; Optimization
DOI
10.1007/s12652-022-04073-8
Abstract
Artificial neural networks (ANNs) are widely used in supervised machine learning to analyze signals or images in many applications. Given an annotated learning database, one of the main challenges is to optimize the network weights. Much work has been devoted to solving optimization problems or improving optimization methods in machine learning, including gradient-based, Newton-type, and meta-heuristic methods. For the sake of efficiency, regularization is generally used. When non-smooth regularizers such as the ℓ1 norm are used, especially to promote sparse networks, this optimization becomes challenging due to the non-differentiability of the target criterion. In this paper, we propose an MCMC-based optimization scheme formulated in a Bayesian framework. The proposed scheme solves the above sparse optimization problem using an efficient sampling scheme and Hamiltonian dynamics. The designed optimizer is evaluated on four datasets, and the results are verified through a comparative study with two CNNs. Promising results show that the proposed method allows ANNs, even with low complexity levels, to reach accuracy rates of up to 94%. The proposed method is also faster and more robust to overfitting. More importantly, its training step is much faster than that of all competing algorithms. © 2022, The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature.
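The abstract's core difficulty is the non-differentiability of an ℓ1-regularized training criterion. The standard tool for this is the proximal operator of the ℓ1 norm (soft-thresholding), which the sketch below illustrates; this is a generic illustration of handling a non-smooth ℓ1 term, not the paper's Bayesian/Hamiltonian sampler, and `prox_grad_step` is a hypothetical helper name.

```python
import numpy as np

def soft_threshold(w, lam):
    """Proximal operator of lam * ||w||_1: shrink each weight toward zero by lam."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

def prox_grad_step(w, grad, step, lam):
    """One proximal-gradient update for a loss with an added lam * ||w||_1 term:
    take a gradient step on the smooth part, then apply the l1 prox."""
    return soft_threshold(w - step * grad, step * lam)

# Small weights are driven exactly to zero, which is what "promoting sparse
# networks" with the l1 norm means in practice.
w = np.array([3.0, -1.0, 0.2])
print(soft_threshold(w, 0.5))  # -> [ 2.5 -0.5  0. ]
```

The exact zeroing of small coordinates (rather than mere shrinkage, as with an ℓ2 penalty) is why ℓ1 regularization yields sparse networks, and why the criterion is non-differentiable at zero.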
Pages: 13813-13831
Page count: 18