Stochastic Gradient Descent Method of Convolutional Neural Network Using Fractional-Order Momentum

Cited by: 0
Authors
Kan T. [1 ]
Gao Z. [1 ,2 ]
Yang C. [1 ]
Affiliations
[1] School of Mathematics, Liaoning University, Shenyang
[2] College of Light Industry, Liaoning University, Shenyang
Funding
China Postdoctoral Science Foundation
Keywords
Convolutional Neural Network; Fractional-Order Difference; Stochastic Gradient Descent
DOI
10.16451/j.cnki.issn1003-6059.202006009
Abstract
The stochastic gradient descent method may converge to a local optimum. To address this problem, a stochastic gradient descent method for convolutional neural networks using fractional-order momentum is proposed to improve the recognition accuracy and the convergence rate of learning. The parameter updating rule is improved by combining the traditional momentum-based stochastic gradient descent method with the fractional-order difference method. The influence of the fractional order on the training of the network parameters is discussed, and an order adjustment method is proposed. The validity of the proposed parameter training method is verified and analyzed on the MNIST and CIFAR-10 datasets. The experimental results show that the proposed method improves the recognition accuracy and learning convergence rate of convolutional neural networks. © 2020, Science Press. All rights reserved.
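To make the idea in the abstract concrete, the following is a minimal sketch of one way a fractional-order difference could be folded into a classical momentum update, assuming a truncated Grünwald-Letnikov difference applied to the recent gradient history. The class name FractionalMomentumSGD and the hyperparameters lr, momentum, alpha, and history are illustrative assumptions; the exact update rule and order adjustment scheme in the paper may differ.

```python
def gl_coefficients(alpha, n_terms):
    """Truncated Grunwald-Letnikov weights c_k = (-1)^k * C(alpha, k)."""
    coeffs = [1.0]
    for k in range(1, n_terms):
        # Recurrence: c_k = c_{k-1} * (1 - (alpha + 1) / k)
        coeffs.append(coeffs[-1] * (1.0 - (alpha + 1.0) / k))
    return coeffs


class FractionalMomentumSGD:
    """Illustrative fractional-order momentum SGD (a sketch, not the paper's exact rule).

    A truncated Grunwald-Letnikov difference over the recent gradient history
    replaces the plain gradient inside a classical momentum update.
    """

    def __init__(self, lr=0.01, momentum=0.9, alpha=0.5, history=5):
        self.lr = lr
        self.mu = momentum
        self.coeffs = gl_coefficients(alpha, history)  # fractional-difference weights
        self.grads = []  # recent gradients, newest first
        self.v = 0.0     # momentum buffer

    def step(self, param, grad):
        # Keep only as many past gradients as there are weights.
        self.grads.insert(0, grad)
        self.grads = self.grads[: len(self.coeffs)]
        # Weighted combination of current and past gradients (truncated GL difference).
        frac_grad = sum(c * g for c, g in zip(self.coeffs, self.grads))
        # Classical momentum applied to the fractional-order gradient term.
        self.v = self.mu * self.v + frac_grad
        return param - self.lr * self.v


# Toy usage: minimize f(x) = x**2, whose gradient is 2*x.
opt = FractionalMomentumSGD(lr=0.1, momentum=0.9, alpha=0.5, history=5)
x = 5.0
for _ in range(200):
    x = opt.step(x, 2.0 * x)
print(x)  # should end up much closer to the minimizer 0 than the starting point 5.0
```

With alpha = 0 the weights reduce to [1, 0, 0, ...] and the update falls back to classical momentum SGD, which is a convenient sanity check for a sketch like this; the paper itself studies how a non-integer order affects accuracy and convergence and proposes an order adjustment method.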
Pages: 559-567
Number of pages: 8