Stochastic Markov gradient descent and training low-bit neural networks

Cited by: 0
Authors
Ashbrock, Jonathan [1 ]
Powell, Alexander M. [2 ]
Affiliations
[1] MITRE Corp, McLean, VA 22102 USA
[2] Vanderbilt Univ, Dept Math, Nashville, TN 37240 USA
Source
SAMPLING THEORY SIGNAL PROCESSING AND DATA ANALYSIS | 2021, Vol. 19, Issue 02
Keywords
Neural networks; Quantization; Stochastic gradient descent; Stochastic Markov gradient descent; Low-memory training;
DOI
10.1007/s43670-021-00015-1
Chinese Library Classification (CLC)
O29 [Applied Mathematics];
Subject classification code
070104;
Abstract
The massive size of modern neural networks has motivated substantial recent interest in neural network quantization, especially low-bit quantization. We introduce Stochastic Markov Gradient Descent (SMGD), a discrete optimization method applicable to training quantized neural networks. The SMGD algorithm is designed for settings where memory is highly constrained during training. We provide theoretical guarantees of algorithm performance as well as encouraging numerical results.
Pages: 23
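The abstract describes SMGD only at a high level; the paper itself defines the algorithm precisely. Purely as an illustration of the general idea, the sketch below shows one plausible reading: the weights stay on a fixed quantization grid for the whole of training, and each step moves a weight one grid point in the descent direction with a probability chosen so that the expected move matches an ordinary SGD step. Everything here (the function name smgd_step, the grid spacing delta, the toy objective) is our own illustrative choice, not the authors' construction.

```python
import numpy as np

def smgd_step(weights, grads, lr, delta, rng):
    """One hypothetical SMGD-style update on the grid delta * Z.

    Each weight moves one grid point against the gradient with
    probability min(1, lr * |grad| / delta); whenever lr * |grad| <= delta,
    the expected update therefore equals the plain SGD step -lr * grad,
    while the iterate itself stays exactly on the low-bit grid.
    """
    p = np.minimum(1.0, lr * np.abs(grads) / delta)   # per-weight move probability
    move = rng.random(weights.shape) < p              # Bernoulli(p) coin flips
    return weights - delta * np.sign(grads) * move

# Toy usage: minimize f(w) = 0.5 * ||w||^2, whose gradient is w itself.
rng = np.random.default_rng(0)
delta = 2.0 ** -4                                     # grid spacing, e.g. a 4-bit fractional part
w = delta * np.round(rng.normal(size=5) / delta)      # start on the grid
for _ in range(500):
    w = smgd_step(w, grads=w, lr=0.05, delta=delta, rng=rng)
print(w)  # entries remain exact multiples of delta and settle near 0
```

In a sketch like this the iterates never leave the grid, so no full-precision shadow copy of the weights is kept during training, which is consistent with the abstract's emphasis on settings where memory is highly constrained.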