Monolithically Integrated RRAM- and CMOS-Based In-Memory Computing Optimizations for Efficient Deep Learning

Cited by: 70
Authors
Yin, Shihui [1 ]
Kim, Yulhwa [2 ]
Han, Xu [1 ]
Barnaby, Hugh [1 ]
Yu, Shimeng [3 ]
Luo, Yandong [3 ]
He, Wangxin [1 ]
Sun, Xiaoyu [3 ]
Kim, Jae-Joon [2 ]
Seo, Jae-sun [1 ]
Affiliations
[1] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85287 USA
[2] Pohang Univ Sci & Technol, Dept Creat IT Engn, Pohang, South Korea
[3] Georgia Inst Technol, Sch Elect & Comp Engn, Atlanta, GA 30332 USA
Funding
U.S. National Science Foundation; National Research Foundation of Singapore;
Keywords
Deep neural networks;
DOI
10.1109/MM.2019.2943047
CLC Classification Number
TP3 [computing technology, computer technology];
Discipline Classification Code
0812;
Abstract
Resistive RAM (RRAM) has been presented as a promising memory technology for deep neural network (DNN) hardware design, offering nonvolatility, high density, a high ON/OFF ratio, and compatibility with logic processes. However, prior RRAM works for DNNs have shown limitations in parallelism for in-memory computing, in array efficiency due to large peripheral circuits, in multilevel analog operation, and in demonstrations of monolithic integration. In this article, we propose circuit-/device-level optimizations to improve the energy efficiency and density of RRAM-based in-memory computing architectures. We report experimental results based on a prototype chip design with 128 x 64 RRAM arrays and CMOS peripheral circuits, where the RRAM devices are monolithically integrated in a commercial 90-nm CMOS technology. We demonstrate CMOS peripheral circuit optimization using an input-splitting scheme and investigate the implications of a higher low-resistance state on energy efficiency and robustness. Employing the proposed techniques, we demonstrate RRAM-based in-memory computing with up to 116.0 TOPS/W energy efficiency and 84.2% CIFAR-10 accuracy. Furthermore, we investigate four-level programming with a single RRAM device, and report system-level performance and DNN accuracy results using the circuit-level benchmark simulator NeuroSim.
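The bitline-current accumulation that underlies RRAM in-memory computing, and the idea of splitting the input vector so each partial bitline current spans a smaller range, can be illustrated with a small numerical sketch. This is illustrative only: the conductance values and ON/OFF ratio below are assumptions, not the paper's measured device parameters, and the two-way split is a loose analogue of the paper's input-splitting scheme, not its circuit implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed device parameters (NOT the paper's measured values).
G_ON = 1.0 / 100e3          # low-resistance-state conductance (100 kOhm)
ON_OFF_RATIO = 100
G_OFF = G_ON / ON_OFF_RATIO  # high-resistance-state conductance

rows, cols = 128, 64         # array size matching the prototype chip
weights = rng.integers(0, 2, (rows, cols))   # binary weights stored as RRAM states
G = np.where(weights == 1, G_ON, G_OFF)      # conductance map of the crossbar

v_in = rng.integers(0, 2, rows).astype(float)  # binary inputs driven on wordlines

# In-memory MAC: each bitline sums currents I = V * G over all activated rows
# (Kirchhoff's current law), computing a vector-matrix product in one step.
i_bitline = v_in @ G

# Input splitting: drive the array in two halves so each partial bitline
# current stays in a smaller range, relaxing the sense-amplifier/ADC
# requirements; a digital addition then restores the full sum.
i_top = v_in[:rows // 2] @ G[:rows // 2]
i_bot = v_in[rows // 2:] @ G[rows // 2:]
assert np.allclose(i_top + i_bot, i_bitline)

# The ideal digital dot product differs from the analog sum only through
# the nonzero OFF-state leakage current (finite ON/OFF ratio).
ideal = v_in @ weights
```

The finite ON/OFF ratio is why the abstract highlights it as a device merit: the smaller `G_OFF` is relative to `G_ON`, the closer the analog bitline current tracks the ideal binary dot product.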
Pages: 54-63
Page count: 10
References
17 items in total
[1] Anonymous, 2019, arXiv:1909.07514.
[2] Chen P.-Y., Peng X., Yu S., "NeuroSim: A Circuit-Level Macro Model for Benchmarking Neuro-Inspired Architectures in Online Learning," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2018, 37(12): 3067-3080.
[3] Ho C. H., 2017, International Electron Devices Meeting (IEDM).
[4] Hubara I., 2016, Advances in Neural Information Processing Systems, vol. 29.
[5] Jiang Z. W., 2018, IEEE Symposium on VLSI Technology, p. 173. DOI: 10.1109/VLSIT.2018.8510687.
[6] Kim Y., 2018, Proceedings of the International Symposium on Low Power Electronics and Design, p. 41.
[7] Li S., Niu D., Malladi K. T., Zheng H., Brennan B., Xie Y., "DRISA: A DRAM-based Reconfigurable In-Situ Accelerator," 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2017, pp. 288-301.
[8] Mochida R., 2018, IEEE Symposium on VLSI Technology, p. 175. DOI: 10.1109/VLSIT.2018.8510676.
[9] Rusk N., "Deep learning," Nature Methods, 2016, 13(1): 35.
[10] Seshadri V., Lee D., Mullins T., Hassan H., Boroumand A., Kim J., Kozuch M. A., Mutlu O., Gibbons P. B., Mowry T. C., "Ambit: In-Memory Accelerator for Bulk Bitwise Operations Using Commodity DRAM Technology," 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2017, pp. 273-287.