NICE: Noise Injection and Clamping Estimation for Neural Network Quantization

Cited by: 5

Authors:

Baskin, Chaim [1]

Zheltonozhkii, Evgenii [1]

Rozen, Tal [2]

Liss, Natan [2]

Chai, Yoav [3]

Schwartz, Eli [3]

Giryes, Raja [3]

Bronstein, Alexander M. [1]

Mendelson, Avi [1]

Affiliations:

[1] Technion, Dept Comp Sci, IL-3200003 Haifa, Israel

[2] Technion, Dept Elect Engn, IL-3200003 Haifa, Israel

[3] Tel Aviv Univ, Sch Elect Engn, IL-6997801 Tel Aviv, Israel

Source:

Mathematics | 2021, Vol. 9, Issue 17

Keywords:

neural networks; low power; quantization; CNN architecture

DOI:

10.3390/math9172144

Chinese Library Classification (CLC) number:

O1 [Mathematics]

Discipline classification codes:

0701; 070101

Abstract:

Convolutional Neural Networks (CNNs) are very popular in many fields including computer vision, speech recognition, natural language processing, etc. Though deep learning leads to groundbreaking performance in those domains, the networks used are very computationally demanding and are far from being able to perform in real-time applications even on a GPU, which is not power efficient and therefore does not suit low power systems such as mobile devices. To overcome this challenge, some solutions have been proposed for quantizing the weights and activations of these networks, which accelerate the runtime significantly. Yet, this acceleration comes at the cost of a larger error unless spatial adjustments are carried out. The method proposed in this work trains quantized neural networks by noise injection and a learned clamping, which improve accuracy. This leads to state-of-the-art results on various regression and classification tasks, e.g., ImageNet classification with architectures such as ResNet-18/34/50 with as low as 3 bit weights and activations. We implement the proposed solution on an FPGA to demonstrate its applicability for low-power real-time applications. The quantization code will become publicly available upon acceptance.
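The two ingredients named in the abstract, clamping and noise injection, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`clamp`, `quantize_uniform`, `inject_noise`), the fixed clamp range, and the choice of uniform noise one quantization step wide are all assumptions made for illustration. In the paper the clamping range is learned during training, which this sketch leaves out.

```python
import random


def clamp(x, lo, hi):
    """Restrict x to the interval [lo, hi]."""
    return max(lo, min(hi, x))


def quantize_uniform(x, lo, hi, bits):
    """Clamp x to [lo, hi], then round it to one of 2**bits evenly spaced levels."""
    step = (hi - lo) / (2 ** bits - 1)
    index = round((clamp(x, lo, hi) - lo) / step)
    return lo + index * step


def inject_noise(x, lo, hi, bits, rng):
    """Hypothetical training-time surrogate for rounding.

    Instead of rounding, add uniform noise spanning one quantization step
    (an assumption of this sketch), so the perturbation has the same
    magnitude as the rounding error but the operation stays smooth in x.
    """
    step = (hi - lo) / (2 ** bits - 1)
    return clamp(x, lo, hi) + rng.uniform(-step / 2, step / 2)


if __name__ == "__main__":
    rng = random.Random(0)
    # 3-bit example on an assumed clamp range of [0, 1].
    print(quantize_uniform(0.4, 0.0, 1.0, 3))
    print(inject_noise(0.4, 0.0, 1.0, 3, rng))
```

Under these assumptions, `quantize_uniform` stands in for inference-time behavior and `inject_noise` for the training-time approximation; values outside the clamp range saturate at the nearest bound in both.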


Pages: 12

