Hybrid RRAM/SRAM In-Memory Computing for Robust DNN Acceleration

Cited by: 14
Authors
Krishnan, Gokul [1 ]
Wang, Zhenyu [1 ]
Yeo, Injune [1 ]
Yang, Li [1 ]
Meng, Jian [1 ]
Liehr, Maximilian [2 ]
Joshi, Rajiv, V [3 ]
Cady, Nathaniel C. [2 ]
Fan, Deliang [1 ]
Seo, Jae-Sun [1 ]
Cao, Yu [1 ]
Affiliations
[1] Arizona State Univ, Sch Elect Comp & Energy Engn, Tempe, AZ 85287 USA
[2] State Univ New York Polytech, Dept Nanobiosci, Albany, NY 12203 USA
[3] IBM TJ Watson Res Ctr, Yorktown Hts, NY 10598 USA
Funding
U.S. National Science Foundation
Keywords
Random access memory; Computer architecture; Training; Quantization (signal); Hardware; Performance evaluation; Resistance; In-memory compute; robust deep neural network (DNN) acceleration; RRAM; SRAM;
DOI
10.1109/TCAD.2022.3197516
Chinese Library Classification (CLC)
TP3 [Computing technology, computer technology]
Discipline Code
0812
Abstract
RRAM-based in-memory computing (IMC) effectively accelerates deep neural networks (DNNs) and other machine learning algorithms. However, in the presence of RRAM device variations and lower precision, mapping DNNs to RRAM-based IMC suffers severe accuracy loss. In this work, we propose a novel hybrid IMC architecture that integrates an RRAM-based IMC macro with a digital SRAM macro using a programmable shifter to compensate for the RRAM variations and recover the accuracy. The digital SRAM macro consists of a small SRAM memory array and an array of multiply-and-accumulate (MAC) units. The nonideal output from the RRAM macro, due to device and circuit nonidealities, is compensated by adding the precise output from the SRAM macro. In addition, the programmable shifter allows for different scales of compensation by shifting the SRAM macro output relative to the RRAM macro output. On the algorithm side, we develop a framework to train DNNs for the hybrid IMC architecture through ensemble learning. The proposed framework performs quantization of weights and activations, pruning, and RRAM IMC-aware training, and employs ensemble learning across different compensation scales by utilizing the programmable shifter. Finally, we design a silicon prototype of the proposed hybrid IMC architecture in the 65-nm SUNY process to demonstrate its efficacy. Experimental evaluation of the hybrid IMC architecture shows that the SRAM compensation allows for a realistic IMC architecture with multilevel RRAM cells (MLCs) even though they suffer from high variations. The hybrid IMC architecture achieves up to 21.9%, 12.65%, and 6.52% improvement in post-mapping accuracy over state-of-the-art techniques, at minimal overhead, for ResNet-20 on CIFAR-10, VGG-16 on CIFAR-10, and ResNet-18 on ImageNet, respectively.
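The compensation path summarized in the abstract can be illustrated with a minimal NumPy sketch, assuming a simple multiplicative conductance-variation noise model, a placeholder compensation-weight matrix w_comp, and illustrative function names; none of these are the paper's exact circuit behavior or training scheme. The sketch only shows the structural idea: an exact SRAM MAC output, scaled by a programmable shift, is added to a variation-affected RRAM MAC output.

```python
import numpy as np

rng = np.random.default_rng(0)

def rram_macro(x, w, sigma=0.1):
    """Analog RRAM crossbar MAC with multiplicative device variation.

    The Gaussian perturbation of the weights is an illustrative stand-in
    for MLC RRAM conductance variation, not the paper's measured model.
    """
    w_noisy = w * (1.0 + sigma * rng.standard_normal(w.shape))
    return x @ w_noisy

def sram_macro(x, w_comp):
    """Digital SRAM macro: small, exact MAC array over compensation weights."""
    return x @ w_comp

def hybrid_imc(x, w, w_comp, shift):
    """Hybrid output: noisy RRAM MAC plus the SRAM correction, with the
    SRAM output scaled by a programmable shifter (multiply by 2**shift)."""
    return rram_macro(x, w) + (sram_macro(x, w_comp) * (1 << shift))

# Toy usage: one layer's weights, placeholder compensation weights, shift = 1.
x = rng.standard_normal((4, 128))
w = rng.standard_normal((128, 64))
w_comp = 0.01 * rng.standard_normal((128, 64))  # hypothetical compensation weights
y = hybrid_imc(x, w, w_comp, shift=1)
print(y.shape)  # (4, 64)
```

In the full architecture, the compensation weights and the per-layer shift would come out of the training flow (quantization, pruning, and RRAM IMC-aware training), with different shift settings acting as the ensemble members described in the abstract; this sketch leaves that training loop out.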
Pages: 4241-4252
Number of pages: 12