HD-CIM: Hybrid-Device Computing-In-Memory Structure Based on MRAM and SRAM to Reduce Weight Loading Energy of Neural Networks

Cited by: 15

Authors:

Zhang, He [1]

Liu, Junzhan [1]

Bai, Jinyu [1]

Li, Sai [2]

Luo, Lichuan [1]

Wei, Shaoqian [1]

Wu, Jianxin [1]

Kang, Wang [1]

Affiliations:

[1] Beihang Univ, Sch Integrated Circuit Sci & Engn, Fert Beijing Inst, Beijing 100191, Peoples R China

[2] Beihang Univ, Shen Yuan Honors Coll, Sch Integrated Circuit Sci & Engn, Fert Beijing Inst, Beijing 100191, Peoples R China

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS | 2022 / Vol. 69 / No. 11

Keywords:

Computing-in-memory (CIM); neural networks (NNs); MRAM; SRAM; EFFICIENT; DESIGN; MACRO; ACCELERATORS; HARDWARE; 6T-SRAM;

DOI:

10.1109/TCSI.2022.3199440

Chinese Library Classification:

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Subject classification codes:

0808 ; 0809 ;

Abstract:

SRAM-based computing-in-memory (SRAM-CIM) techniques have been widely studied for neural networks (NNs) as a way to overcome the "von Neumann bottleneck". However, as NN models grow, the weights can no longer be stored fully on-chip, owing to the large cell size (and hence limited capacity) of SRAM. The NN weight data must then be loaded frequently from external memories such as DRAM and Flash, which results in high energy consumption and low efficiency. In this paper, we propose a hybrid-device computing-in-memory (HD-CIM) architecture based on SRAM and MRAM (magnetic random-access memory). In HD-CIM, the NN weight data are stored in on-chip MRAM and loaded into the SRAM-CIM core, significantly reducing energy and latency. In addition, to improve the data transfer efficiency between MRAM and SRAM, a high-speed pipelined MRAM readout structure is proposed to reduce the bit-line (BL) charging time. Our results show that the NN weight loading energy in our design is only 0.242 pJ/bit, which is 289× lower than loading from off-chip DRAM. Moreover, the energy breakdown and efficiency are analyzed for different NN models, such as VGG19, ResNet18 and MobileNetV1. Our design improves energy efficiency by 58× to 124×.
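The per-bit figures in the abstract can be turned into a rough estimate of total weight-loading energy. The following is a minimal sketch, not the authors' evaluation method: it assumes "289× lower" means the off-chip DRAM loading energy is 289 times the 0.242 pJ/bit figure, and the function names, the weight count and the bit width in the usage note are hypothetical illustration values that do not come from the paper.

```python
# Figures quoted in the abstract.
MRAM_LOAD_PJ_PER_BIT = 0.242   # on-chip MRAM -> SRAM-CIM loading energy
DRAM_TO_MRAM_RATIO = 289       # reported reduction versus off-chip DRAM


def implied_dram_pj_per_bit() -> float:
    """DRAM loading energy implied by the two abstract figures (assumption:
    the ratio is a simple multiplicative factor on energy per bit)."""
    return MRAM_LOAD_PJ_PER_BIT * DRAM_TO_MRAM_RATIO


def loading_energy_uj(num_weights: int, bits_per_weight: int,
                      pj_per_bit: float) -> float:
    """Energy in microjoules to load every weight once (1 pJ = 1e-6 uJ)."""
    return num_weights * bits_per_weight * pj_per_bit * 1e-6


if __name__ == "__main__":
    # Hypothetical model: 1,000,000 weights at 8 bits each.
    n, bits = 1_000_000, 8
    print("MRAM path (uJ):", loading_energy_uj(n, bits, MRAM_LOAD_PJ_PER_BIT))
    print("DRAM path (uJ):", loading_energy_uj(n, bits, implied_dram_pj_per_bit()))
```

Under that reading, the implied DRAM cost is about 69.9 pJ/bit. The estimate covers a single pass over the weights only and ignores compute energy, so it does not reproduce the 58× to 124× overall efficiency range reported for full models.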

Pages: 4465-4474

Number of pages: 10
