Stochastic Markov gradient descent and training low-bit neural networks

Cited by: 0
Authors
Ashbrock, Jonathan [1 ]
Powell, Alexander M. [2 ]
Affiliations
[1] MITRE Corp, Mclean, VA 22102 USA
[2] Vanderbilt Univ, Dept Math, Nashville, TN 37240 USA
Source
SAMPLING THEORY SIGNAL PROCESSING AND DATA ANALYSIS | 2021 / Vol. 19 / Issue 02
Keywords
Neural networks; Quantization; Stochastic gradient descent; Stochastic Markov gradient descent; Low-memory training;
DOI
10.1007/s43670-021-00015-1
Chinese Library Classification
O29 [Applied Mathematics];
Discipline Classification Code
070104;
Abstract
The massive size of modern neural networks has motivated substantial recent interest in neural network quantization, especially low-bit quantization. We introduce Stochastic Markov Gradient Descent (SMGD), a discrete optimization method applicable to training quantized neural networks. The SMGD algorithm is designed for settings where memory is highly constrained during training. We provide theoretical guarantees of algorithm performance as well as encouraging numerical results.
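As a rough illustration of the idea described in the abstract, the sketch below shows a stochastic, grid-constrained gradient update in NumPy: weights stay on a fixed low-bit grid and each coordinate moves at most one grid step per iteration, with a move probability proportional to the gradient magnitude. The function name, grid spacing, step-size rule, and toy loss are illustrative assumptions and not the authors' published SMGD algorithm.

# Illustrative sketch (NOT the authors' exact SMGD update): gradient descent in
# which weights live on a fixed low-bit grid and each coordinate moves at most
# one grid step per iteration, with probability proportional to the gradient.
import numpy as np

rng = np.random.default_rng(0)

def quantized_sgd_step(w, grad, delta=2**-4, lr=0.1):
    """One stochastic, grid-constrained update.

    w     : weights already lying on the grid {k * delta : k integer}
    grad  : gradient of the loss at w (e.g. from a minibatch)
    delta : grid spacing (the quantization step); assumed value
    lr    : scales the move probabilities; assumed value
    """
    # Probability of moving one grid step, capped at 1, so the expected
    # (unclipped) update equals the usual SGD step -lr * grad.
    p = np.clip(lr * np.abs(grad) / delta, 0.0, 1.0)
    move = (rng.random(w.shape) < p).astype(w.dtype)
    # Step one grid point against the gradient direction.
    return w - delta * np.sign(grad) * move

# Toy usage: minimize ||w - target||^2 over grid-valued w.
target = np.array([0.30, -0.70, 0.15])
w = np.zeros_like(target)          # starts on the grid
for _ in range(200):
    grad = 2.0 * (w - target)      # exact gradient of the toy loss
    w = quantized_sgd_step(w, grad)
print(w)                           # w remains a multiple of delta throughout

Because every weight remains a grid point after each update, only the quantized values need to be stored during training, which is the memory-constrained setting the abstract refers to.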
Pages: 23
Related Papers (50 in total)
  • [31] Stochastic Diagonal Approximate Greatest Descent in Neural Networks
    Tan, Hong Hui
    Lim, King Hann
    Harno, Hendra Gunawan
    2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2017, : 1895 - 1898
  • [32] Overparametrized Multi-layer Neural Networks: Uniform Concentration of Neural Tangent Kernel and Convergence of Stochastic Gradient Descent
    Xu, Jiaming
    Zhu, Hanjing
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25 : 1 - 83
  • [33] Differentially private stochastic gradient descent with low-noise
    Wang, Puyu
    Lei, Yunwen
    Ying, Yiming
    Zhou, Ding-Xuan
    NEUROCOMPUTING, 2024, 587
  • [34] Adaptive Stochastic Conjugate Gradient Optimization for Backpropagation Neural Networks
    Hashem, Ibrahim Abaker Targio
    Alaba, Fadele Ayotunde
    Jumare, Muhammad Haruna
    Ibrahim, Ashraf Osman
    Abulfaraj, Anas Waleed
    IEEE ACCESS, 2024, 12 : 33757 - 33768
  • [35] Training neural networks by stochastic optimisation
    Verikas, A
    Gelzinis, A
    NEUROCOMPUTING, 2000, 30 (1-4) : 153 - 172
  • [36] Mixed Precision Low-Bit Quantization of Neural Network Language Models for Speech Recognition
    Xu, Junhao
    Yu, Jianwei
    Hu, Shoukang
    Liu, Xunying
    Meng, Helen
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 3679 - 3693
  • [38] Convergence rates for shallow neural networks learned by gradient descent
    Braun, Alina
    Kohler, Michael
    Langer, Sophie
    Walk, Harro
    BERNOULLI, 2024, 30 (01) : 475 - 502
  • [39] Fast gradient descent algorithm for image classification with neural networks
    El Mouatasim, Abdelkrim
    SIGNAL IMAGE AND VIDEO PROCESSING, 2020, 14 : 1565 - 1572
  • [40] BRIDGING THE GAP BETWEEN CONSTANT STEP SIZE STOCHASTIC GRADIENT DESCENT AND MARKOV CHAINS
    Dieuleveut, Aymeric
    Durmus, Alain
    Bach, Francis
    ANNALS OF STATISTICS, 2020, 48 (03): : 1348 - 1382