Partial Sum Quantization for Reducing ADC Size in ReRAM-Based Neural Network Accelerators

Cited by: 0
Authors
Azamat, Azat [1 ]
Asim, Faaiz [2 ]
Kim, Jintae [3 ]
Lee, Jongeun [2 ]
Affiliations
[1] Ulsan Natl Inst Sci & Technol, Dept Comp Sci & Engn, Ulsan 44919, South Korea
[2] Ulsan Natl Inst Sci & Technol, Dept Elect Engn, Ulsan 44919, South Korea
[3] Konkuk Univ, Dept Elect & Elect Engn, Seoul 143701, South Korea
Keywords
Quantization (signal); Hardware; Artificial neural networks; Convolutional neural networks; Training; Throughput; Costs; AC-DC power converters; Memristors; Analog-to-digital conversion (ADC); convolutional neural network (CNN); in-memory computing accelerator; memristor; quantization
DOI
10.1109/TCAD.2023.3294461
CLC number
TP3 [Computing Technology, Computer Technology]
Subject classification code
0812
Abstract
While resistive random-access memory (ReRAM) crossbar arrays have the potential to significantly accelerate deep neural network (DNN) training through fast and low-cost matrix-vector multiplication, peripheral circuits such as analog-to-digital converters (ADCs) introduce high overhead, consuming over half of the chip power and a considerable portion of the chip cost. To address this challenge, we propose advanced quantization techniques that can significantly reduce the ADC overhead of ReRAM crossbar arrays (RCAs). Our methodology interprets the ADC as a quantization mechanism, allowing us to optimally scale the ADC input range along with the weight parameters of a DNN, yielding a reduction of multiple bits in ADC precision. This approach reduces ADC size and power consumption severalfold and is applicable to any DNN type (binarized or multibit) and any RCA size. Additionally, we propose ways to minimize the overhead of the digital scaler, which our scheme sometimes requires. Our experimental results using ResNet-18 on the ImageNet dataset demonstrate that our method can reduce ADC size by 32 times compared with ISAAC, with a minimal accuracy degradation of 0.24%. We also present evaluation results in the presence of ReRAM nonidealities (such as stuck-at faults).
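As a rough illustration of the idea described in the abstract, the following minimal NumPy sketch (not the paper's implementation) splits a long dot product into per-column partial sums, scales each partial sum into a narrowed ADC input range, digitizes it at reduced precision, and rescales it with a digital scaler before accumulation. The names (`adc_quantize`, `rca_dot`) and the values of `alpha`, `col_size`, and the ADC full-scale are illustrative assumptions; the paper derives the scaling jointly with the DNN's weight parameters during training.

```python
import numpy as np

rng = np.random.default_rng(0)

def adc_quantize(x, n_bits, full_scale):
    # Model an n-bit ADC as a uniform quantizer clipped to [-full_scale, full_scale].
    levels = 2 ** n_bits - 1
    step = 2.0 * full_scale / levels
    return np.round(np.clip(x, -full_scale, full_scale) / step) * step

def rca_dot(weights, inputs, col_size, n_bits, alpha):
    # Map a long dot product onto crossbar columns of height `col_size`.
    # Each column's analog partial sum is scaled by `alpha` into the ADC's
    # fixed input range, digitized with few bits, then rescaled digitally
    # (the "digital scaler") before accumulation. `alpha` is a placeholder
    # here; the paper optimizes this scaling together with the weights.
    total = 0.0
    for start in range(0, len(weights), col_size):
        ps = np.dot(weights[start:start + col_size],
                    inputs[start:start + col_size])  # bit-line partial sum
        total += adc_quantize(alpha * ps, n_bits, full_scale=1.0) / alpha
    return total

w = 0.05 * rng.standard_normal(512)  # toy weight vector
x = rng.random(512)                  # toy activations in [0, 1)
print(f"exact result:        {np.dot(w, x):+.4f}")
print(f"4-bit ADC, alpha=.5: {rca_dot(w, x, col_size=128, n_bits=4, alpha=0.5):+.4f}")
```

With a well-chosen `alpha`, the reduced-precision result tracks the exact dot product closely, which is the effect the paper exploits to shrink the ADC by multiple bits.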
Pages: 4897-4908
Number of pages: 12