Partial Sum Quantization for Reducing ADC Size in ReRAM-Based Neural Network Accelerators

Cited by: 0
Authors
Azamat, Azat [1 ]
Asim, Faaiz [2 ]
Kim, Jintae [3 ]
Lee, Jongeun [2 ]
Affiliations
[1] Ulsan Natl Inst Sci & Technol, Dept Comp Sci & Engn, Ulsan 44919, South Korea
[2] Ulsan Natl Inst Sci & Technol, Dept Elect Engn, Ulsan 44919, South Korea
[3] Konkuk Univ, Dept Elect & Elect Engn, Seoul 143701, South Korea
Keywords
Quantization (signal); Hardware; Artificial neural networks; Convolutional neural networks; Training; Throughput; Costs; AC-DC power converters; Memristors; Analog-to-digital conversion (ADC); convolutional neural network (CNN); in-memory computing accelerator; memristor; quantization
DOI
10.1109/TCAD.2023.3294461
CLC number
TP3 [Computing Technology, Computer Technology]
Subject classification code
0812
Abstract
While resistive random-access memory (ReRAM) crossbar arrays have the potential to significantly accelerate deep neural network (DNN) training through fast and low-cost matrix-vector multiplication, peripheral circuits such as analog-to-digital converters (ADCs) introduce high overhead, consuming over half of the chip power and a considerable portion of the chip cost. To address this challenge, we propose advanced quantization techniques that can significantly reduce the ADC overhead of ReRAM crossbar arrays (RCAs). Our methodology interprets the ADC as a quantization mechanism, allowing us to optimally scale the ADC input range along with the weight parameters of a DNN, yielding a reduction of multiple bits in ADC precision. This approach reduces ADC size and power consumption severalfold and is applicable to any DNN type (binarized or multibit) and any RCA size. Additionally, we propose ways to minimize the overhead of the digital scaler, which our scheme sometimes requires. Our experimental results using ResNet-18 on the ImageNet dataset demonstrate that our method can reduce ADC size by 32 times compared with ISAAC, with a minimal accuracy degradation of 0.24%. We also present evaluation results in the presence of ReRAM nonidealities (such as stuck-at faults).
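As a rough illustration of the idea described in the abstract, the following minimal NumPy sketch (not the paper's implementation) splits a long dot product into per-column partial sums, scales each partial sum into a narrowed ADC input range, digitizes it at reduced precision, and rescales it with a digital scaler before accumulation. The names (`adc_quantize`, `rca_dot`) and the values of `alpha`, `col_size`, and the ADC full-scale are illustrative assumptions; the paper derives the scaling jointly with the DNN's weight parameters during training.

```python
import numpy as np

rng = np.random.default_rng(0)

def adc_quantize(x, n_bits, full_scale):
    # Model an n-bit ADC as a uniform quantizer clipped to [-full_scale, full_scale].
    levels = 2 ** n_bits - 1
    step = 2.0 * full_scale / levels
    return np.round(np.clip(x, -full_scale, full_scale) / step) * step

def rca_dot(weights, inputs, col_size, n_bits, alpha):
    # Map a long dot product onto crossbar columns of height `col_size`.
    # Each column's analog partial sum is scaled by `alpha` into the ADC's
    # fixed input range, digitized with few bits, then rescaled digitally
    # (the "digital scaler") before accumulation. `alpha` is a placeholder
    # here; the paper optimizes this scaling together with the weights.
    total = 0.0
    for start in range(0, len(weights), col_size):
        ps = np.dot(weights[start:start + col_size],
                    inputs[start:start + col_size])  # bit-line partial sum
        total += adc_quantize(alpha * ps, n_bits, full_scale=1.0) / alpha
    return total

w = 0.05 * rng.standard_normal(512)  # toy weight vector
x = rng.random(512)                  # toy activations in [0, 1)
print(f"exact result:        {np.dot(w, x):+.4f}")
print(f"4-bit ADC, alpha=.5: {rca_dot(w, x, col_size=128, n_bits=4, alpha=0.5):+.4f}")
```

With a well-chosen `alpha`, the reduced-precision result tracks the exact dot product closely, which is the effect the paper exploits to shrink the ADC by multiple bits.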
Pages: 4897-4908
Number of pages: 12