An Energy-Efficient Inference Engine for a Configurable ReRAM-Based Neural Network Accelerator

Cited by: 4
Authors
Zheng, Yang-Lin [1 ]
Yang, Wei-Yi [1 ]
Chen, Ya-Shu [1 ]
Han, Ding-Hung [1 ]
Affiliations
[1] Natl Taiwan Univ Sci & Technol, Dept Elect Engn, Taipei 10607, Taiwan
Keywords
Voltage; Artificial neural networks; Power demand; Energy efficiency; Virtual machine monitors; Energy consumption; Engines; Analog-to-digital circuit (ADC); neural network (NN) acceleration; processing-in-memory (PIM); resistive random-access memory (ReRAM) crossbar array; RRAM CROSSBAR ARRAY;
DOI
10.1109/TCAD.2022.3184464
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812 ;
Abstract
Resistive random-access memory (ReRAM) offers a potential solution to accelerate the inference of deep neural networks by performing processing-in-memory. However, the peripheral circuits of ReRAM crossbars used to perform arithmetic operations consume significant amounts of power. Based on a power consumption analysis of the ReRAM crossbar circuits, we propose using dynamic reference-voltage-scalable analog-to-digital circuits (ADCs) to conduct the dot-product operation, enabling the reconfigurability of the ReRAM-based neural network (NN) accelerator while maintaining accuracy. We propose a configurable ReRAM-based NN accelerator that provides various degrees of computing granularity at different levels of power consumption, creating a tradeoff between performance and power consumption for the given NN. Next, we develop an energy-efficient inference engine for the configurable ReRAM-based NN accelerator, EIF, which assigns the operation unit (OU) size for performing vector-matrix multiplication (VMM) based on the data dependence of the NN. Our evaluation shows that the proposed EIF provides energy savings of up to 36% over the state-of-the-art ReRAM-based accelerator while maintaining performance without resource duplication.
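The abstract's core idea, a VMM decomposed into operation units whose analog partial sums are digitized by an ADC with a reference that scales with the OU row count, can be illustrated with a minimal numerical sketch. This is not the paper's implementation; the function name, the uniform quantizer, and the `v_max`/`g_max` full-scale model are all assumptions made for illustration.

```python
def ou_tiled_vmm(v, G, ou_rows=4, ou_cols=4, adc_bits=6,
                 v_max=1.0, g_max=1.0):
    """Illustrative sketch (not the paper's design): vector-matrix
    multiplication on a ReRAM crossbar, tiled into operation units (OUs).
    Each OU's analog bitline current is digitized by an ADC whose
    full-scale reference tracks the number of active rows."""
    n, m = len(G), len(G[0])
    out = [0.0] * m
    levels = 2 ** adc_bits - 1
    # Hypothetical full-scale model: the largest current an OU can
    # produce with ou_rows active wordlines.
    full_scale = ou_rows * v_max * g_max
    for r0 in range(0, n, ou_rows):
        for c0 in range(0, m, ou_cols):
            for c in range(c0, min(c0 + ou_cols, m)):
                # Analog dot product: bitline current I = sum_i V_i * G_ic
                partial = sum(v[r] * G[r][c]
                              for r in range(r0, min(r0 + ou_rows, n)))
                # Uniform ADC quantization at the OU-scaled reference
                code = round(partial / full_scale * levels)
                out[c] += code / levels * full_scale
    return out
```

Under this model, shrinking `ou_rows` lowers the reference voltage and the ADC resolution needed per conversion (less power per step) at the cost of more conversion steps, which is the performance/power tradeoff the configurable accelerator exposes.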
Pages: 740-753
Page count: 14