Fractional-order stochastic gradient descent method with momentum and energy for deep neural networks

Cited by: 6
Authors
Zhou, Xingwen [1, 2]
You, Zhenghao [1]
Sun, Weiguo [1]
Zhao, Dongdong [1]
Yan, Shi [1]
Affiliations
[1] Lanzhou Univ, Sch Informat Sci & Engn, 222 Tianshui South Rd, Lanzhou 730000, Gansu Province, Peoples R China
[2] Lanzhou Univ, Sch Nucl Sci & Technol, 222 Tianshui South Rd, Lanzhou 730000, Gansu Province, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Stochastic gradient descent; Momentum fractional-order; Energy stability; Deep neural networks; Image classification
DOI
10.1016/j.neunet.2024.106810
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, a novel fractional-order stochastic gradient descent with momentum and energy (FOSGDME) approach is proposed. Specifically, to address the difficulty that existing fractional-order gradient algorithms have in converging to a real extreme point, a novel fractional-order stochastic gradient descent (FOSGD) method is presented by modifying the definition of the Caputo fractional-order derivative. A FOSGD with momentum (FOSGDM) is then established by incorporating momentum information to further improve convergence speed and accuracy. In addition, to improve robustness and accuracy, a FOSGD with momentum and energy (FOSGDME) is established by further introducing energy formation. Extensive experimental results on the CIFAR-10 image classification dataset, obtained with ResNet and DenseNet, demonstrate that the proposed FOSGD, FOSGDM and FOSGDME algorithms are superior to integer-order optimization algorithms and achieve state-of-the-art performance.
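The abstract does not give the update equations, so the following is only a minimal sketch of what a fractional-order gradient step with momentum can look like. It assumes a commonly used truncated Caputo-style rule from the fractional-gradient literature, in which the ordinary gradient is scaled by |w_k - w_{k-1}|^(1-alpha) / Gamma(2-alpha); this is not the paper's modified Caputo definition, and the energy component of FOSGDME is omitted. The names `fractional_momentum_step` and `minimize_quadratic`, and the parameters `alpha`, `beta`, `lr` and `eps`, are hypothetical.

```python
import math


def fractional_momentum_step(w, w_prev, v, grad,
                             lr=0.1, alpha=0.9, beta=0.5, eps=1e-8):
    """One element-wise fractional-order gradient step with momentum.

    ASSUMED update rule (illustrative, not taken from the paper):
        g_frac = grad * (|w - w_prev| + eps)**(1 - alpha) / Gamma(2 - alpha)
        v_new  = beta * v + g_frac
        w_new  = w - lr * v_new
    For alpha = 1 this reduces to ordinary gradient descent with momentum.
    """
    scale = 1.0 / math.gamma(2.0 - alpha)
    v_new = [
        beta * vi + gi * (abs(wi - pi) + eps) ** (1.0 - alpha) * scale
        for vi, gi, wi, pi in zip(v, grad, w, w_prev)
    ]
    w_new = [wi - lr * vi for wi, vi in zip(w, v_new)]
    return w_new, v_new


def minimize_quadratic(target, steps=1000, **kwargs):
    """Minimize the toy loss f(w) = sum((w_i - target_i)**2) from w = 0."""
    w = [0.0] * len(target)
    w_prev = list(w)
    v = [0.0] * len(target)
    for _ in range(steps):
        grad = [2.0 * (wi - ti) for wi, ti in zip(w, target)]
        w_next, v = fractional_momentum_step(w, w_prev, v, grad, **kwargs)
        w_prev, w = w, w_next
    return w


# Toy run: the iterate should approach the true minimizer [3.0, -1.0].
w_final = minimize_quadratic([3.0, -1.0])
```

Because the fractional factor multiplies the ordinary gradient, this assumed rule is stationary exactly where the gradient vanishes, which is one simple way to see how a modified definition can target a real extreme point; the small `eps` keeps the step from stalling when two consecutive iterates coincide.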
Pages: 13