AdaXod: a new adaptive and momental bound algorithm for training deep neural networks

Cited by: 6
Authors
Liu, Yuanxuan [1]
Li, Dequan [2]
Affiliations
[1] Anhui Univ Sci & Technol, Sch Math & Big Data, Huainan 232001, Anhui, Peoples R China
[2] Anhui Univ Sci & Technol, Sch Artificial Intelligence, Huainan 232001, Anhui, Peoples R China
Keywords
Adaptive algorithm; Deep neural network; Image classification; Adaptive and momental bound
DOI
10.1007/s11227-023-05338-5
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Adaptive algorithms are widely used in deep learning because of their fast convergence. Among them, Adam is the most widely used. However, studies have shown that Adam generalizes poorly. AdaX is a variant of Adam that replaces Adam's second-order moment with a novel second-order momentum and generalizes well. However, these algorithms may fail to converge because of instability and extreme learning rates during training. In this paper, we propose a new adaptive and momental bound algorithm, called AdaXod, which is characterized by exponentially averaging the learning rate and is particularly useful for training deep neural networks. By setting an adaptively limited learning rate in the AdaX algorithm, AdaXod effectively eliminates the problem of excessive learning rates in the later stage of neural network training and thus yields stable training. We conduct extensive experiments on different datasets and verify the advantages of AdaXod by comparing it with other advanced adaptive optimization algorithms. AdaXod eliminates large learning rates during neural network training and outperforms other optimizers, especially for neural networks with complex structures, such as DenseNet.
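The abstract describes the idea only in words: AdaX-style moments plus an exponentially averaged learning rate that acts as an upper bound. The sketch below is one plausible reading of that description, not the paper's published update rule. It assumes the AdaX second-moment recursion and an AdaMod-style bound (per-parameter rate clipped by its own exponential moving average). The function names `init_state` and `adaxod_like_step`, the parameter `beta3`, and all default values are hypothetical and chosen for illustration.

```python
import math


def init_state(n):
    """Zero-initialized optimizer state for n scalar parameters."""
    return {"t": 0, "m": [0.0] * n, "v": [0.0] * n, "s": [0.0] * n}


def adaxod_like_step(theta, grad, state, lr=1e-3, beta1=0.9,
                     beta2=1e-4, beta3=0.999, eps=1e-8):
    """One illustrative update; ASSUMED form, not taken from the paper."""
    state["t"] += 1
    t = state["t"]
    new_theta = []
    for i, (p, g) in enumerate(zip(theta, grad)):
        # First moment: exponential moving average of the gradient.
        m = beta1 * state["m"][i] + (1 - beta1) * g
        # AdaX-style second moment (assumed): accumulates with factor (1 + beta2).
        v = (1 + beta2) * state["v"][i] + beta2 * g * g
        v_hat = v / ((1 + beta2) ** t - 1)
        # Unbounded per-parameter learning rate.
        eta = lr / (math.sqrt(v_hat) + eps)
        # Exponential moving average of the learning rate (the momental bound).
        s = beta3 * state["s"][i] + (1 - beta3) * eta
        # Clip the rate so it never exceeds its own running average.
        eta_bounded = min(eta, s)
        state["m"][i], state["v"][i], state["s"][i] = m, v, s
        new_theta.append(p - eta_bounded * m)
    return new_theta


# Usage: a few hundred steps on f(x) = x^2, whose gradient is 2x.
theta = [1.0]
state = init_state(len(theta))
for _ in range(200):
    grad = [2.0 * x for x in theta]
    theta = adaxod_like_step(theta, grad, state)
```

Because the running average `s` starts at zero, the bounded rate is very small in early steps and grows gradually, which is how this kind of bound suppresses sudden large learning rates.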
Pages: 17691-17715
Page count: 25