NICE: Noise Injection and Clamping Estimation for Neural Network Quantization

Cited by: 5
Authors
Baskin, Chaim [1]
Zheltonozhkii, Evgenii [1]
Rozen, Tal [2]
Liss, Natan [2]
Chai, Yoav [3]
Schwartz, Eli [3]
Giryes, Raja [3]
Bronstein, Alexander M. [1]
Mendelson, Avi [1]
Affiliations
[1] Technion, Dept Comp Sci, IL-3200003 Haifa, Israel
[2] Technion, Dept Elect Engn, IL-3200003 Haifa, Israel
[3] Tel Aviv Univ, Sch Elect Engn, IL-6997801 Tel Aviv, Israel
Keywords
neural networks; low power; quantization; CNN architecture
DOI
10.3390/math9172144
Chinese Library Classification
O1 [Mathematics]
Discipline classification code
0701; 070101
Abstract
Convolutional Neural Networks (CNNs) are widely used in many fields, including computer vision, speech recognition, and natural language processing. Although deep learning delivers groundbreaking performance in these domains, the networks involved are computationally demanding and are far from real-time performance even on a GPU, which is itself not power efficient and therefore unsuited to low-power systems such as mobile devices. To overcome this challenge, several solutions have been proposed that quantize the weights and activations of these networks, which significantly reduces runtime. Yet this acceleration comes at the cost of a larger error unless spatial adjustments are carried out. The method proposed in this work trains quantized neural networks using noise injection and learned clamping, which improve accuracy. This leads to state-of-the-art results on various regression and classification tasks, e.g., ImageNet classification with architectures such as ResNet-18/34/50 with weights and activations as low as 3 bits. We implement the proposed solution on an FPGA to demonstrate its applicability to low-power real-time applications. The quantization code will become publicly available upon acceptance.
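The two ingredients named in the abstract, clamping and noise injection, can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, the fixed `clamp` argument, and the choice of uniform noise spanning one quantization step are assumptions made for illustration, and the learning of the clamp value during training is not shown.

```python
import numpy as np

def clamp_and_quantize(x, clamp, bits):
    """Clip x to [0, clamp], then round to 2**bits uniformly spaced levels.

    `clamp` is passed in as a fixed number here (an assumption for this
    sketch); learning it by backpropagation is not shown.
    """
    step = clamp / (2 ** bits - 1)          # width of one quantization bin
    clipped = np.clip(x, 0.0, clamp)
    return np.round(clipped / step) * step

def inject_uniform_noise(w, bits, rng):
    """Training-time surrogate for rounding (hypothetical helper).

    Adds uniform noise in [-step/2, step/2], which is the range of the
    rounding error of a uniform quantizer with the same step.
    """
    step = (w.max() - w.min()) / (2 ** bits - 1)
    return w + rng.uniform(-step / 2, step / 2, size=w.shape)

rng = np.random.default_rng(0)
activations = rng.normal(size=1000)
weights = rng.normal(size=1000)

q_act = clamp_and_quantize(activations, clamp=2.0, bits=3)
noisy_w = inject_uniform_noise(weights, bits=3, rng=rng)
print(len(np.unique(q_act)), "distinct activation levels")
```

With 3 bits, the quantized activations can take at most 8 distinct values, and the injected noise never moves a weight by more than half a quantization step.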
Pages: 12
Related papers
50 in total
  • [41] Neural Network based Modeling of Audible Noise for High Frequency Injection based Position Estimation for PM Synchronous Motors at Low and Zero speed
    Khan, Ahmad Arshan
    Mohammed, Osama
    2009 IEEE ELECTRIC SHIP TECHNOLOGIES SYMPOSIUM, 2009, : 119 - 122
  • [42] Improving the Post-Training Neural Network Quantization by Prepositive Feature Quantization
    Chu, Tianshu
    Yang, Zuopeng
    Huang, Xiaolin
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2024, 34 (04) : 3056 - 3060
  • [43] Estimation of quantization noise for adaptive-prediction lifting schemes
    Parrilli, Sara
    Cagnazzo, Marco
    Pesquet-Popescu, Beatrice
    2009 IEEE INTERNATIONAL WORKSHOP ON MULTIMEDIA SIGNAL PROCESSING (MMSP 2009), 2009, : 449 - +
  • [44] A Neural Network Observer for Injection Rate Estimation in Common Rail Injectors with Nozzle Wear
    Hofmann, Oliver
    Kiener, Manuel
    Rixen, Daniel
    PROCEEDINGS OF DINAME 2017, 2019, : 277 - 289
  • [45] Denoising based on noise parameter estimation in speckled OCT images using neural network
    Avanaki, Mohammad R. N.
    Laissue, P. Philippe
    Podoleanu, Adrian G.
    Hojjat, Ali
    1ST CANTERBURY WORKSHOP ON OPTICAL COHERENCE TOMOGRAPHY AND ADAPTIVE OPTICS, 2008, 7139
  • [46] RECURRENT NEURAL NETWORK LANGUAGE MODEL TRAINING WITH NOISE CONTRASTIVE ESTIMATION FOR SPEECH RECOGNITION
    Chen, X.
    Liu, X.
    Gales, M. J. E.
    Woodland, P. C.
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 5411 - 5415
  • [47] On Training Bi-directional Neural Network Language Model with Noise Contrastive Estimation
    He, Tianxing
    Zhang, Yu
    Droppo, Jasha
    Yu, Kai
    2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2016,
  • [48] Learning Vector Quantization Neural Network Method for Network Intrusion Detection
    Yang, Degang
    Wuhan University Journal of Natural Sciences, 2007, (01) : 147 - 150
  • [49] ADAPTIVE LAYERWISE QUANTIZATION FOR DEEP NEURAL NETWORK COMPRESSION
    Zhu, Xiaotian
    Zhou, Wengang
    Li, Houqiang
    2018 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), 2018,
  • [50] Quantization Aware Factorization for Deep Neural Network Compression
    Cherniuk, Daria
    Abukhovich, Stanislav
    Phan, Anh-Huy
    Oseledets, Ivan
    Cichocki, Andrzej
    Gusak, Julia
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2024, 81 : 973 - 988