Unbiased quasi-hyperbolic Nesterov-gradient momentum-based optimizers for accelerating convergence

Cited by: 2
Authors
Cheng, Weiwei [1 ]
Yang, Xiaochun [1 ,2 ,3 ]
Wang, Bin [1 ,2 ,3 ]
Wang, Wei [4 ,5 ]
Affiliations
[1] Northeastern Univ, Sch Comp Sci & Engn, Shenyang 110167, Liaoning, Peoples R China
[2] Natl Frontiers Sci Ctr Ind Intelligence & Syst Op, Shenyang, Peoples R China
[3] Northeastern Univ, Key Lab Data Analyt & Optimizat Smart Ind, Minist Educ, Shenyang, Peoples R China
[4] Hong Kong Univ Sci & Technol Guangzhou, Informat Hub, Guangzhou, Guangdong, Peoples R China
[5] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Optimizer; Momentum; Accelerate convergence; Unbiased;
DOI
10.1007/s11280-022-01086-3
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
In the training of deep learning models, one important step is choosing an appropriate optimizer, which directly determines the final performance of the model. Choosing the appropriate direction and step size (i.e., learning rate) of each parameter update is the decisive factor for an optimizer. Previous gradient descent optimizers can oscillate and fail to converge to a minimum because they are sensitive only to the current gradient. Momentum-Based Optimizers (MBOs) have been widely adopted recently because they relieve oscillation and thereby accelerate convergence, using an exponentially decaying average of past gradients to fine-tune the update direction. However, we find that most existing MBOs are biased and inconsistent with the locally fastest descent direction, resulting in a high number of iterations. To accelerate convergence, we propose an Unbiased strategy that adjusts the descent direction of a variety of MBOs. We further propose an Unbiased Quasi-hyperbolic Nesterov-gradient strategy (UQN) that combines our Unbiased strategy with the existing Quasi-hyperbolic and Nesterov-gradient strategies. Each update step then moves in the locally fastest descent direction, predicts the future gradient to avoid overshooting the minimum, and reduces gradient variance. We extend our strategies to multiple MBOs and prove their convergence. The main experimental results in this paper are based on popular neural network models and benchmark datasets and demonstrate the effectiveness and universality of our proposed strategies.
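To make the ingredients named in the abstract concrete, the sketch below combines a Nesterov-style lookahead gradient, an exponentially decaying momentum average with an Adam-style bias correction (one simple way to obtain an unbiased momentum estimate), and quasi-hyperbolic mixing of the raw gradient with the corrected momentum. This is a minimal NumPy illustration of those building blocks under assumed hyperparameters (lr, beta, nu) and an assumed lookahead form; it is not the paper's exact UQN update rule.

```python
import numpy as np

def uqn_style_step(theta, grad_fn, m, t, lr=0.01, beta=0.9, nu=0.7):
    """One illustrative update combining the three ingredients named in
    the abstract. NOTE: a sketch of the building blocks, not the paper's
    exact UQN rule; lr, beta, nu, and the lookahead form are assumptions."""
    # Nesterov-style lookahead: evaluate the gradient slightly ahead of
    # the current iterate, in the direction of the momentum buffer.
    g = grad_fn(theta - lr * beta * m)
    # Exponentially decaying average of gradients (the momentum buffer).
    m = beta * m + (1.0 - beta) * g
    # Unbiased estimate: divide out the initialization bias (as in Adam),
    # so the averaged direction is not skewed toward the zero init.
    m_hat = m / (1.0 - beta ** t)
    # Quasi-hyperbolic mixing: a weighted average of the raw gradient
    # and the bias-corrected momentum.
    theta = theta - lr * ((1.0 - nu) * g + nu * m_hat)
    return theta, m

# Toy usage on f(x) = 0.5 * ||x||^2, whose gradient is x itself.
theta = np.array([5.0, -3.0])
m = np.zeros_like(theta)
for t in range(1, 201):
    theta, m = uqn_style_step(theta, lambda x: x, m, t)
print(theta)  # moves toward the minimum at the origin
```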
Pages: 1323-1344
Number of pages: 22
Related Papers
8 records in total
  • [1] Unbiased quasi-hyperbolic Nesterov-gradient momentum-based optimizers for accelerating convergence
    Weiwei Cheng
    Xiaochun Yang
    Bin Wang
    Wei Wang
    World Wide Web, 2023, 26: 1323 - 1344
  • [2] Perturbation Initialization, Adam-Nesterov and Quasi-Hyperbolic Momentum for Adversarial Examples
    Zou J.-H.
    Duan Y.-X.
    Ren C.-L.
    Qiu J.-Y.
    Zhou X.-Y.
    Pan Z.-S.
    Tien Tzu Hsueh Pao/Acta Electronica Sinica, 2022, 50 (01): 207 - 216
  • [3] Convergence of Momentum-Based Stochastic Gradient Descent
    Jin, Ruinan
    He, Xingkang
    2020 IEEE 16TH INTERNATIONAL CONFERENCE ON CONTROL & AUTOMATION (ICCA), 2020: 779 - 784
  • [4] On the Global Optimum Convergence of Momentum-based Policy Gradient
    Ding, Yuhao
    Zhang, Junzi
    Lavaei, Javad
    INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS, VOL 151, 2022, 151
  • [5] Federated Gradient Averaging for Multi-Site Training with Momentum-Based Optimizers
    Remedios, Samuel W.
    Butman, John A.
    Landman, Bennett A.
    Pham, Dzung L.
    DOMAIN ADAPTATION AND REPRESENTATION TRANSFER, AND DISTRIBUTED AND COLLABORATIVE LEARNING, DART 2020, DCL 2020, 2020, 12444 : 170 - 180
  • [6] An Adaptive Quasi-Hyperbolic Momentum Method Based on AdaGrad+ Strategy
    Wei, Hongxu
    Zhang, Xu
    Fang, Zhi
    2022 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, COMPUTER VISION AND MACHINE LEARNING (ICICML), 2022: 649 - 654
  • [7] A robust multi-scale learning network with quasi-hyperbolic momentum-based Adam optimizer for bearing intelligent fault diagnosis under sample imbalance scenarios and strong noise environment
    Ye, Maoyou
    Yan, Xiaoan
    Chen, Ning
    Liu, Ying
    STRUCTURAL HEALTH MONITORING-AN INTERNATIONAL JOURNAL, 2024, 23 (03): 1664 - 1686
  • [8] Global Convergence of Stochastic Gradient Hamiltonian Monte Carlo for Nonconvex Stochastic Optimization: Nonasymptotic Performance Bounds and Momentum-Based Acceleration
    Gao, Xuefeng
    Gürbüzbalaban, Mert
    Zhu, Lingjiong
    Operations Research, 2022, 70 (05): 2931 - 2947