Stochastic Markov gradient descent and training low-bit neural networks

Cited by: 0
Authors
Ashbrock, Jonathan [1 ]
Powell, Alexander M. [2 ]
Affiliations
[1] MITRE Corp, Mclean, VA 22102 USA
[2] Vanderbilt Univ, Dept Math, Nashville, TN 37240 USA
Source
SAMPLING THEORY SIGNAL PROCESSING AND DATA ANALYSIS | 2021 / Vol. 19 / Issue 02
Keywords
Neural networks; Quantization; Stochastic gradient descent; Stochastic Markov gradient descent; Low-memory training;
DOI
10.1007/s43670-021-00015-1
Chinese Library Classification
O29 [Applied Mathematics];
Discipline Classification Code
070104;
Abstract
The massive size of modern neural networks has motivated substantial recent interest in neural network quantization, especially low-bit quantization. We introduce Stochastic Markov Gradient Descent (SMGD), a discrete optimization method applicable to training quantized neural networks. The SMGD algorithm is designed for settings where memory is highly constrained during training. We provide theoretical guarantees of algorithm performance as well as encouraging numerical results.
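As a rough illustration of the idea described in the abstract, the sketch below shows a stochastic, grid-constrained gradient update in NumPy: weights stay on a fixed low-bit grid and each coordinate moves at most one grid step per iteration, with a move probability proportional to the gradient magnitude. The function name, grid spacing, step-size rule, and toy loss are illustrative assumptions and not the authors' published SMGD algorithm.

# Illustrative sketch (NOT the authors' exact SMGD update): gradient descent in
# which weights live on a fixed low-bit grid and each coordinate moves at most
# one grid step per iteration, with probability proportional to the gradient.
import numpy as np

rng = np.random.default_rng(0)

def quantized_sgd_step(w, grad, delta=2**-4, lr=0.1):
    """One stochastic, grid-constrained update.

    w     : weights already lying on the grid {k * delta : k integer}
    grad  : gradient of the loss at w (e.g. from a minibatch)
    delta : grid spacing (the quantization step); assumed value
    lr    : scales the move probabilities; assumed value
    """
    # Probability of moving one grid step, capped at 1, so the expected
    # (unclipped) update equals the usual SGD step -lr * grad.
    p = np.clip(lr * np.abs(grad) / delta, 0.0, 1.0)
    move = (rng.random(w.shape) < p).astype(w.dtype)
    # Step one grid point against the gradient direction.
    return w - delta * np.sign(grad) * move

# Toy usage: minimize ||w - target||^2 over grid-valued w.
target = np.array([0.30, -0.70, 0.15])
w = np.zeros_like(target)          # starts on the grid
for _ in range(200):
    grad = 2.0 * (w - target)      # exact gradient of the toy loss
    w = quantized_sgd_step(w, grad)
print(w)                           # w remains a multiple of delta throughout

Because every weight remains a grid point after each update, only the quantized values need to be stored during training, which is the memory-constrained setting the abstract refers to.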
Pages: 23
Related Papers (50 in total)
  • [31] Stochastic Diagonal Approximate Greatest Descent in Neural Networks
    Tan, Hong Hui
    Lim, King Hann
    Harno, Hendra Gunawan
    2017 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN), 2017, : 1895 - 1898
  • [32] Overparametrized Multi-layer Neural Networks: Uniform Concentration of Neural Tangent Kernel and Convergence of Stochastic Gradient Descent
    Xu, Jiaming
    Zhu, Hanjing
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25 : 1 - 83
  • [33] Differentially private stochastic gradient descent with low-noise
    Wang, Puyu
    Lei, Yunwen
    Ying, Yiming
    Zhou, Ding-Xuan
    NEUROCOMPUTING, 2024, 587
  • [34] Adaptive Stochastic Conjugate Gradient Optimization for Backpropagation Neural Networks
    Hashem, Ibrahim Abaker Targio
    Alaba, Fadele Ayotunde
    Jumare, Muhammad Haruna
    Ibrahim, Ashraf Osman
    Abulfaraj, Anas Waleed
    IEEE ACCESS, 2024, 12 : 33757 - 33768
  • [35] Training neural networks by stochastic optimisation
    Verikas, A
    Gelzinis, A
    NEUROCOMPUTING, 2000, 30 (1-4) : 153 - 172
  • [36] Mixed Precision Low-Bit Quantization of Neural Network Language Models for Speech Recognition
    Xu, Junhao
    Yu, Jianwei
    Hu, Shoukang
    Liu, Xunying
    Meng, Helen
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 3679 - 3693
  • [38] Convergence rates for shallow neural networks learned by gradient descent
    Braun, Alina
    Kohler, Michael
    Langer, Sophie
    Walk, Harro
    BERNOULLI, 2024, 30 (01) : 475 - 502
  • [39] Fast gradient descent algorithm for image classification with neural networks
    El Mouatasim, Abdelkrim
    SIGNAL IMAGE AND VIDEO PROCESSING, 2020, 14 : 1565 - 1572
  • [40] BRIDGING THE GAP BETWEEN CONSTANT STEP SIZE STOCHASTIC GRADIENT DESCENT AND MARKOV CHAINS
    Dieuleveut, Aymeric
    Durmus, Alain
    Bach, Francis
    ANNALS OF STATISTICS, 2020, 48 (03): : 1348 - 1382