Monolithically Integrated RRAM- and CMOS-Based In-Memory Computing Optimizations for Efficient Deep Learning

Cited by: 70
Authors
Yin, Shihui [1 ]
Kim, Yulhwa [2 ]
Han, Xu [1 ]
Barnaby, Hugh [1 ]
Yu, Shimeng [3 ]
Luo, Yandong [3 ]
He, Wangxin [1 ]
Sun, Xiaoyu [3 ]
Kim, Jae-Joon [2 ]
Seo, Jae-sun [1 ]
Affiliations
[1] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85287 USA
[2] Pohang Univ Sci & Technol, Dept Creat IT Engn, Pohang, South Korea
[3] Georgia Inst Technol, Sch Elect & Comp Engn, Atlanta, GA 30332 USA
Funding
U.S. National Science Foundation; National Research Foundation of Singapore;
Keywords
Deep neural networks;
DOI
10.1109/MM.2019.2943047
CLC Classification Number
TP3 [computing technology, computer technology];
Discipline Classification Code
0812;
Abstract
Resistive RAM (RRAM) has been presented as a promising memory technology for deep neural network (DNN) hardware design, offering nonvolatility, high density, a high ON/OFF ratio, and compatibility with logic processes. However, prior RRAM works for DNNs have shown limitations in parallelism for in-memory computing, in array efficiency due to large peripheral circuits, in multilevel analog operation, and in demonstrations of monolithic integration. In this article, we propose circuit-/device-level optimizations to improve the energy efficiency and density of RRAM-based in-memory computing architectures. We report experimental results based on a prototype chip design with 128 x 64 RRAM arrays and CMOS peripheral circuits, where the RRAM devices are monolithically integrated in a commercial 90-nm CMOS technology. We demonstrate CMOS peripheral circuit optimization using an input-splitting scheme and investigate the implications of a higher low-resistance state on energy efficiency and robustness. Employing the proposed techniques, we demonstrate RRAM-based in-memory computing with up to 116.0 TOPS/W energy efficiency and 84.2% CIFAR-10 accuracy. Furthermore, we investigate four-level programming with a single RRAM device, and report system-level performance and DNN accuracy results using the circuit-level benchmark simulator NeuroSim.
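The bitline-current accumulation that underlies RRAM in-memory computing, and the idea of splitting the input vector so each partial bitline current spans a smaller range, can be illustrated with a small numerical sketch. This is illustrative only: the conductance values and ON/OFF ratio below are assumptions, not the paper's measured device parameters, and the two-way split is a loose analogue of the paper's input-splitting scheme, not its circuit implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed device parameters (NOT the paper's measured values).
G_ON = 1.0 / 100e3          # low-resistance-state conductance (100 kOhm)
ON_OFF_RATIO = 100
G_OFF = G_ON / ON_OFF_RATIO  # high-resistance-state conductance

rows, cols = 128, 64         # array size matching the prototype chip
weights = rng.integers(0, 2, (rows, cols))   # binary weights stored as RRAM states
G = np.where(weights == 1, G_ON, G_OFF)      # conductance map of the crossbar

v_in = rng.integers(0, 2, rows).astype(float)  # binary inputs driven on wordlines

# In-memory MAC: each bitline sums currents I = V * G over all activated rows
# (Kirchhoff's current law), computing a vector-matrix product in one step.
i_bitline = v_in @ G

# Input splitting: drive the array in two halves so each partial bitline
# current stays in a smaller range, relaxing the sense-amplifier/ADC
# requirements; a digital addition then restores the full sum.
i_top = v_in[:rows // 2] @ G[:rows // 2]
i_bot = v_in[rows // 2:] @ G[rows // 2:]
assert np.allclose(i_top + i_bot, i_bitline)

# The ideal digital dot product differs from the analog sum only through
# the nonzero OFF-state leakage current (finite ON/OFF ratio).
ideal = v_in @ weights
```

The finite ON/OFF ratio is why the abstract highlights it as a device merit: the smaller `G_OFF` is relative to `G_ON`, the closer the analog bitline current tracks the ideal binary dot product.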
Pages: 54-63
Page count: 10
References
17 items in total
[1] Anonymous, 2019, arXiv:1909.07514.
[2] Chen P.-Y., Peng X., Yu S., "NeuroSim: A Circuit-Level Macro Model for Benchmarking Neuro-Inspired Architectures in Online Learning," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2018, 37(12): 3067-3080.
[3] Ho C. H., 2017, International Electron Devices Meeting (IEDM).
[4] Hubara I., 2016, Advances in Neural Information Processing Systems, vol. 29.
[5] Jiang Z. W., 2018, IEEE Symposium on VLSI Technology, p. 173. DOI: 10.1109/VLSIT.2018.8510687.
[6] Kim Y., 2018, Proceedings of the International Symposium on Low Power Electronics and Design, p. 41.
[7] Li S., Niu D., Malladi K. T., Zheng H., Brennan B., Xie Y., "DRISA: A DRAM-based Reconfigurable In-Situ Accelerator," 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2017, pp. 288-301.
[8] Mochida R., 2018, IEEE Symposium on VLSI Technology, p. 175. DOI: 10.1109/VLSIT.2018.8510676.
[9] Rusk N., "Deep learning," Nature Methods, 2016, 13(1): 35.
[10] Seshadri V., Lee D., Mullins T., Hassan H., Boroumand A., Kim J., Kozuch M. A., Mutlu O., Gibbons P. B., Mowry T. C., "Ambit: In-Memory Accelerator for Bulk Bitwise Operations Using Commodity DRAM Technology," 50th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), 2017, pp. 273-287.