Stochastic Markov gradient descent and training low-bit neural networks

Cited by: 0
Authors
Ashbrock, Jonathan [1 ]
Powell, Alexander M. [2 ]
Affiliations
[1] MITRE Corp, McLean, VA 22102 USA
[2] Vanderbilt Univ, Dept Math, Nashville, TN 37240 USA
Source
SAMPLING THEORY SIGNAL PROCESSING AND DATA ANALYSIS | 2021, Vol. 19, Issue 02
Keywords
Neural networks; Quantization; Stochastic gradient descent; Stochastic Markov gradient descent; Low-memory training;
DOI
10.1007/s43670-021-00015-1
Chinese Library Classification (CLC)
O29 [Applied Mathematics];
Subject classification code
070104;
Abstract
The massive size of modern neural networks has motivated substantial recent interest in neural network quantization, especially low-bit quantization. We introduce Stochastic Markov Gradient Descent (SMGD), a discrete optimization method applicable to training quantized neural networks. The SMGD algorithm is designed for settings where memory is highly constrained during training. We provide theoretical guarantees of algorithm performance as well as encouraging numerical results.
Pages: 23
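The abstract describes SMGD only at a high level; the paper itself defines the algorithm precisely. Purely as an illustration of the general idea, the sketch below shows one plausible reading: the weights stay on a fixed quantization grid for the whole of training, and each step moves a weight one grid point in the descent direction with a probability chosen so that the expected move matches an ordinary SGD step. Everything here (the function name smgd_step, the grid spacing delta, the toy objective) is our own illustrative choice, not the authors' construction.

```python
import numpy as np

def smgd_step(weights, grads, lr, delta, rng):
    """One hypothetical SMGD-style update on the grid delta * Z.

    Each weight moves one grid point against the gradient with
    probability min(1, lr * |grad| / delta); whenever lr * |grad| <= delta,
    the expected update therefore equals the plain SGD step -lr * grad,
    while the iterate itself stays exactly on the low-bit grid.
    """
    p = np.minimum(1.0, lr * np.abs(grads) / delta)   # per-weight move probability
    move = rng.random(weights.shape) < p              # Bernoulli(p) coin flips
    return weights - delta * np.sign(grads) * move

# Toy usage: minimize f(w) = 0.5 * ||w||^2, whose gradient is w itself.
rng = np.random.default_rng(0)
delta = 2.0 ** -4                                     # grid spacing, e.g. a 4-bit fractional part
w = delta * np.round(rng.normal(size=5) / delta)      # start on the grid
for _ in range(500):
    w = smgd_step(w, grads=w, lr=0.05, delta=delta, rng=rng)
print(w)  # entries remain exact multiples of delta and settle near 0
```

In a sketch like this the iterates never leave the grid, so no full-precision shadow copy of the weights is kept during training, which is consistent with the abstract's emphasis on settings where memory is highly constrained.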