An Energy-Efficient Inference Engine for a Configurable ReRAM-Based Neural Network Accelerator

Cited by: 4
Authors
Zheng, Yang-Lin [1 ]
Yang, Wei-Yi [1 ]
Chen, Ya-Shu [1 ]
Han, Ding-Hung [1 ]
Affiliations
[1] Natl Taiwan Univ Sci & Technol, Dept Elect Engn, Taipei 10607, Taiwan
Keywords
Voltage; Artificial neural networks; Power demand; Energy efficiency; Virtual machine monitors; Energy consumption; Engines; Analog-to-digital circuit (ADC); neural network (NN) acceleration; processing-in-memory (PIM); resistive random-access memory (ReRAM) crossbar array; RRAM CROSSBAR ARRAY;
DOI
10.1109/TCAD.2022.3184464
Chinese Library Classification (CLC)
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812 ;
Abstract
Resistive random-access memory (ReRAM) offers a potential solution to accelerate the inference of deep neural networks by performing processing-in-memory. However, the peripheral circuits of ReRAM crossbars used to perform arithmetic operations consume significant amounts of power. Based on a power consumption analysis of the ReRAM crossbar circuits, we propose using dynamic reference-voltage-scalable analog-to-digital circuits (ADCs) to conduct the dot-product operation, enabling the reconfigurability of the ReRAM-based neural network (NN) accelerator while maintaining accuracy. We propose a configurable ReRAM-based NN accelerator that provides various degrees of computing granularity at different levels of power consumption, creating a tradeoff between performance and power consumption for the given NN. Next, we develop an energy-efficient inference engine for the configurable ReRAM-based NN accelerator, EIF, which assigns the operation unit (OU) size for performing vector-matrix multiplication (VMM) based on the data dependence of the NN. Our evaluation shows that the proposed EIF provides energy savings of up to 36% over the state-of-the-art ReRAM-based accelerator while maintaining performance without resource duplication.
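The abstract's core idea, a VMM decomposed into operation units whose analog partial sums are digitized by an ADC with a reference that scales with the OU row count, can be illustrated with a minimal numerical sketch. This is not the paper's implementation; the function name, the uniform quantizer, and the `v_max`/`g_max` full-scale model are all assumptions made for illustration.

```python
def ou_tiled_vmm(v, G, ou_rows=4, ou_cols=4, adc_bits=6,
                 v_max=1.0, g_max=1.0):
    """Illustrative sketch (not the paper's design): vector-matrix
    multiplication on a ReRAM crossbar, tiled into operation units (OUs).
    Each OU's analog bitline current is digitized by an ADC whose
    full-scale reference tracks the number of active rows."""
    n, m = len(G), len(G[0])
    out = [0.0] * m
    levels = 2 ** adc_bits - 1
    # Hypothetical full-scale model: the largest current an OU can
    # produce with ou_rows active wordlines.
    full_scale = ou_rows * v_max * g_max
    for r0 in range(0, n, ou_rows):
        for c0 in range(0, m, ou_cols):
            for c in range(c0, min(c0 + ou_cols, m)):
                # Analog dot product: bitline current I = sum_i V_i * G_ic
                partial = sum(v[r] * G[r][c]
                              for r in range(r0, min(r0 + ou_rows, n)))
                # Uniform ADC quantization at the OU-scaled reference
                code = round(partial / full_scale * levels)
                out[c] += code / levels * full_scale
    return out
```

Under this model, shrinking `ou_rows` lowers the reference voltage and the ADC resolution needed per conversion (less power per step) at the cost of more conversion steps, which is the performance/power tradeoff the configurable accelerator exposes.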
Pages: 740-753
Page count: 14