XGrad: Boosting Gradient-Based Optimizers With Weight Prediction

Cited by: 2
Authors
Guan, Lei [1 ]
Li, Dongsheng [2 ]
Shi, Yanqi [2 ]
Meng, Jian [1 ]
Affiliations
[1] Natl Univ Def Technol, Dept Math, Changsha 410073, Hunan, Peoples R China
[2] Natl Univ Def Technol, Natl Key Lab Parallel & Distributed Comp, Changsha 410073, Hunan, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Training; Artificial neural networks; Convergence; Computational modeling; Backpropagation; Proposals; Predictive models; deep learning; generalization; gradient-based optimizer; weight prediction
DOI
10.1109/TPAMI.2024.3387399
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
In this paper, we propose XGrad, a general deep learning training framework that introduces weight prediction into popular gradient-based optimizers to boost their convergence and generalization when training deep neural network (DNN) models. Specifically, ahead of each mini-batch, the future weights are predicted according to the update rule of the optimizer in use and are then applied to both the forward pass and the backward propagation. Throughout training, the optimizer therefore always uses the gradients w.r.t. the future weights to update the DNN parameters, which yields better convergence and generalization than the same optimizer without weight prediction. XGrad is straightforward to implement yet effective in improving both the convergence of gradient-based optimizers and the accuracy of DNN models. Empirical results covering five popular optimizers, namely SGD with momentum, Adam, AdamW, AdaBelief, and AdaM3, demonstrate the effectiveness of our proposal: XGrad attains higher model accuracy than the baseline optimizers when training DNN models.
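
To make the weight-prediction step concrete, below is a minimal Python/PyTorch sketch assuming SGD with momentum as the base optimizer. The function name xgrad_sgd_step, the one-step look-ahead -lr * mu * v, and the explicit momentum buffers are illustrative assumptions, not the authors' reference implementation; the framework applies the same idea to Adam-family optimizers by predicting the future weights with each optimizer's own update rule.

import torch

def xgrad_sgd_step(model, loss_fn, x, y, momenta, lr=0.1, mu=0.9):
    # momenta: per-parameter buffers, initialized once as
    #   momenta = [torch.zeros_like(p) for p in model.parameters()]
    params = list(model.parameters())

    # 1. Cache the current weights, then move to the predicted future
    #    weights. For momentum SGD the next update is approximately
    #    -lr * mu * v (the upcoming gradient term is not yet known).
    cached = [p.detach().clone() for p in params]
    with torch.no_grad():
        for p, v in zip(params, momenta):
            p.add_(v, alpha=-lr * mu)

    # 2. Forward and backward pass at the predicted weights, so the
    #    gradients are taken w.r.t. the future weights.
    model.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()

    # 3. Restore the cached weights and apply the usual momentum update,
    #    driven by the gradients computed at the predicted weights.
    with torch.no_grad():
        for p, c, v in zip(params, cached, momenta):
            p.copy_(c)
            v.mul_(mu).add_(p.grad)   # v <- mu * v + g(predicted w)
            p.add_(v, alpha=-lr)      # w <- w - lr * v
    return loss.item()

Caching and restoring the weights keeps the sketch self-contained; an optimized implementation would presumably fold the prediction into the optimizer's step rather than clone every parameter per mini-batch.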
Pages: 6731-6747
Page count: 17
Related Papers
50 items in total
  • [41] The Barker proposal: Combining robustness and efficiency in gradient-based MCMC
    Livingstone, Samuel
    Zanella, Giacomo
    JOURNAL OF THE ROYAL STATISTICAL SOCIETY SERIES B-STATISTICAL METHODOLOGY, 2022, 84 (02) : 496 - 523
  • [42] A Self-Care Prediction Model for Children with Disability Based on Genetic Algorithm and Extreme Gradient Boosting
    Syafrudin, Muhammad
    Alfian, Ganjar
    Fitriyani, Norma Latif
    Anshari, Muhammad
    Hadibarata, Tony
    Fatwanto, Agung
    Rhee, Jongtae
    MATHEMATICS, 2020, 8 (09)
  • [43] Correcting gradient-based interpretations of deep neural networks for genomics
    Majdandzic, Antonio
    Rajesh, Chandana
    Koo, Peter K.
    GENOME BIOLOGY, 2023, 24 (01)
  • [44] A Stochastic Gradient-Based Projection Algorithm for Distributed Constrained Optimization
    Zhang, Keke
    Gao, Shanfu
    Chen, Yingjue
    Zheng, Zuqing
    Lu, Qingguo
    NEURAL INFORMATION PROCESSING, ICONIP 2023, PT I, 2024, 14447 : 356 - 367
  • [45] Sparse Channel Estimation with Gradient-Based Algorithms: A Comparative Study
    Abd El-Moaty, Ahmed M.
    Zerguine, Azzedine
    2018 15TH INTERNATIONAL MULTI-CONFERENCE ON SYSTEMS, SIGNALS AND DEVICES (SSD), 2018, : 60 - 64
  • [46] Gradient-Based Aero-Stealth Optimization of a Simplified Aircraft
    Thoulon, Charles
    Roge, Gilbert
    Pironneau, Olivier
    FLUIDS, 2024, 9 (08)
  • [47] Learning Gradient-Based ICA by Neurally Estimating Mutual Information
    Hlynsson, Hlynur David
    Wiskott, Laurenz
    ADVANCES IN ARTIFICIAL INTELLIGENCE, KI 2019, 2019, 11793 : 182 - 187
  • [49] On CNN Applied to Speech-to-Text - Comparative Analysis of Different Gradient Based Optimizers
    Gaiceanu, Theodora
    Pastravanu, Octavian
    IEEE 15TH INTERNATIONAL SYMPOSIUM ON APPLIED COMPUTATIONAL INTELLIGENCE AND INFORMATICS (SACI 2021), 2021, : 85 - 90
  • [50] Application of Gradient Boosting in the Design of Fuzzy Rule-Based Regression Models
    Zhang, Huimin
    Hu, Xingchen
    Zhu, Xiubin
    Liu, Xinwang
    Pedrycz, Witold
    IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2024, 36 (11) : 5621 - 5632