CASCADE: Connecting RRAMs to Extend Analog Dataflow In An End-To-End In-Memory Processing Paradigm

Cited by: 69
Authors
Chou, Teyuh [1 ]
Tang, Wei [1 ]
Botimer, Jacob [1 ]
Zhang, Zhengya [1 ]
Affiliation
[1] University of Michigan, Ann Arbor, MI 48109, USA
Source
MICRO'52: THE 52ND ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE | 2019
Keywords
Process in memory; Resistive RAM; Neural network accelerator;
DOI
10.1145/3352460.3358328
CLC Classification Number
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
Processing in memory (PIM) is a concept that enables massively parallel dot products while keeping one set of operands in memory. PIM is ideal for computationally demanding deep neural networks (DNNs) and recurrent neural networks (RNNs). Processing in resistive RAM (RRAM) is particularly appealing due to RRAM's high density and low energy. A key limitation of PIM is the cost of multibit analog-to-digital (A/D) conversions, which can defeat the efficiency and performance benefits of PIM. In this work, we demonstrate the CASCADE architecture, which connects multiply-accumulate (MAC) RRAM arrays with buffer RRAM arrays to extend processing in analog and in memory: dot products are followed by partial-sum buffering and accumulation to implement a complete DNN or RNN layer. Design choices and the array interface are made to enable a variation-tolerant, robust analog dataflow. A new memory mapping scheme named R-Mapping enables the in-RRAM accumulation of partial sums, and an analog summation scheme reduces the number of A/D conversions required to obtain the final sum. CASCADE is compared with recent in-RRAM computation architectures on state-of-the-art DNN and RNN benchmarks. The results show that CASCADE improves energy efficiency by 3.5x while maintaining a competitive throughput.
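
To make the dataflow idea in the abstract concrete, below is a minimal numerical sketch in Python/NumPy. It is not the paper's implementation: the tile size, the 8-bit uniform-quantization model of the A/D converter, and all variable names are illustrative assumptions. The sketch contrasts digitizing every crossbar tile's partial sum with accumulating the partial sums before a single conversion, which is the source of the reduction in A/D conversions that CASCADE targets.

import numpy as np

TILE = 128      # assumed number of crossbar input rows per tile (illustrative)
ADC_BITS = 8    # assumed resolution of the A/D conversion (illustrative)

def quantize(x, bits=ADC_BITS):
    # Model an A/D conversion as uniform quantization over the signal's own range
    # (a simplification for illustration, not the paper's converter design).
    full_scale = np.max(np.abs(x))
    if full_scale == 0:
        return x
    step = 2 * full_scale / (2 ** bits)
    return np.round(x / step) * step

rng = np.random.default_rng(0)
W = rng.normal(size=(512, 64))   # weights split across 512 / TILE = 4 crossbar tiles
x = rng.normal(size=512)         # input activations

# Each tile computes a partial dot product of its slice of rows.
partials = [W[i:i + TILE].T @ x[i:i + TILE] for i in range(0, 512, TILE)]

baseline = sum(quantize(p) for p in partials)   # digitize every partial sum: 4 conversions per output
cascade  = quantize(sum(partials))              # accumulate partial sums first: 1 conversion per output

exact = W.T @ x
print("A/D conversions per output: per-tile = 4, accumulate-then-convert = 1")
print("max error, per-tile conversion:      ", np.max(np.abs(baseline - exact)))
print("max error, accumulate-then-convert:  ", np.max(np.abs(cascade - exact)))

In this sketch the accumulate-then-convert path stands in for CASCADE's buffer-RRAM accumulation and analog summation; the per-tile path stands in for a conventional PIM pipeline that converts each crossbar output before digital accumulation.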
Pages: 114-125
Page count: 12