RRAM-Based Analog Approximate Computing

Cited by: 99

Authors:

Li, Boxun ^{[1]}

Gu, Peng ^{[1]}

Shan, Yi ^{[2]}

Wang, Yu ^{[1]}

Chen, Yiran ^{[3]}

Yang, Huazhong ^{[1]}

Affiliations:

[1] Tsinghua Univ, Dept Elect Engn, Tsinghua Natl Lab Informat Sci & Technol, Beijing 100084, Peoples R China

[2] Baidu Inc, Baidu Res Inst Deep Learning, Beijing 100085, Peoples R China

[3] Univ Pittsburgh, Dept Elect & Comp Engn, Pittsburgh, PA 15261 USA

Source:

IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS | 2015 / Vol. 34 / No. 12

Funding:

U.S. National Science Foundation; National Natural Science Foundation of China;

Keywords:

Approximate computing; neural network; power efficiency; resistive random-access memory (RRAM); NEURAL-NETWORKS; DEVICE; DESIGN; MEMORY;

DOI:

10.1109/TCAD.2015.2445741

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Approximate computing is a promising design paradigm for better performance and power efficiency. In this paper, we propose a power-efficient framework for analog approximate computing with emerging metal-oxide resistive switching random-access memory (RRAM) devices. A programmable RRAM-based approximate computing unit (RRAM-ACU) is first introduced to accelerate approximated computation, and a scalable approximate computing framework is then built on top of the RRAM-ACU. To program the RRAM-ACU efficiently, we also present a detailed configuration flow, which includes a customized approximator training scheme, an approximator-parameter-to-RRAM-state mapping algorithm, and an RRAM state tuning scheme. Finally, the proposed RRAM-based computing framework is modeled at the system level. A predictive compact model is developed to estimate the configuration overhead of the RRAM-ACU and to help explore the application scenarios of RRAM-based analog approximate computing. Simulation results on a set of diverse benchmarks demonstrate that, compared with an x86-64 CPU at 2 GHz, the RRAM-ACU achieves a 4.06-196.41x speedup and a power efficiency of 24.59-567.98 GFLOPS/W with an average quality loss of 8.72%. The implementation of the hierarchical model and X application further demonstrates that the proposed RRAM-based approximate computing framework can achieve more than 12.8x higher power efficiency than its purely digital counterparts (CPU, graphics processing unit, and field-programmable gate array).

Pages: 1905-1917

Page count: 13