On-Device Continual Learning With STT-Assisted-SOT MRAM-Based In-Memory Computing

Cited by: 2
Authors
Zhang, Fan [1]
Sridharan, Amitesh [1]
Hwang, William [2]
Xue, Fen [2]
Tsai, Wilman [3]
Wang, Shan Xiang [4,5]
Fan, Deliang [1]
Affiliations
[1] Johns Hopkins Univ, Dept Elect & Comp Engn, Baltimore, MD 21218 USA
[2] Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA
[3] Stanford Univ, Dept Mat Sci & Engn, Stanford, CA 94305 USA
[4] Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA
[5] Stanford Univ, Dept Mat Sci & Engn, Stanford, CA 94305 USA
Funding
U.S. National Science Foundation
Keywords
Magnetic tunneling; Training; In-memory computing; Task analysis; Quantization (signal); Nonvolatile memory; Resistance; Continual learning; in-memory computing (IMC); MRAM; neural network;
DOI
10.1109/TCAD.2024.3371268
Chinese Library Classification (CLC)
TP3 [Computing technology, computer technology]
Discipline code
0812
Abstract
Due to the separate memory and computation units in the traditional von Neumann architecture, massive data transfer dominates the overall computing system's power and latency, known as the "memory wall" issue. With the ever-increasing size and computational complexity of deep-learning-based AI models, this has become the bottleneck of state-of-the-art AI computing systems. To address this challenge, in-memory computing (IMC)-based neural network accelerators have been widely investigated to perform AI computation within memory. However, most of those works focus only on inference; on-device training and continual learning have not yet been well explored. In this work, for the first time, we introduce on-device continual learning with an STT-assisted-SOT (SAS) magnetoresistive random-access memory (MRAM)-based IMC system. On the hardware side, we fabricated an SAS-MRAM device prototype with four magnetic tunnel junctions (MTJs, each 100 nm × 50 nm) sharing a common heavy-metal layer, achieving significantly improved memory-write and area efficiency compared to traditional SOT-MRAM. We then designed fully digital IMC circuits with the SAS-MRAM to support both neural network inference and on-device learning. To enable efficient on-device continual learning on new-task data, we present an 8-bit integer (INT8) continual learning algorithm that uses the bit-serial digital in-memory convolution operations supported by the SAS-MRAM IMC to train a small parallel reprogramming network (Rep-Net) while freezing the major backbone model. Extensive studies are presented based on the fabricated SAS-MRAM device prototype, cross-layer device-circuit benchmarking and simulation, and the evaluation of the on-device continual learning system.
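To make the abstract's two key ideas concrete, here is a minimal sketch of the bit-serial MAC scheme it refers to: input bits are streamed LSB-first against memory-resident INT8 weights and shifted partial sums are accumulated. This is a generic bit-serial illustration in plain Python, not the paper's actual circuit.

```python
def bitserial_dot(x_int8, w_int8, bits=8):
    """Signed INT8 dot product computed bit-serially: one input bit-plane
    per cycle is multiplied against memory-resident weights, and the shifted
    partial sums are accumulated (the MSB plane is subtracted, per two's
    complement). A generic sketch of bit-serial digital IMC, not the
    paper's exact datapath."""
    acc = 0
    for b in range(bits):
        xb = [(xi >> b) & 1 for xi in x_int8]            # input bit-plane b
        psum = sum(x * w for x, w in zip(xb, w_int8))    # binary-input MAC
        acc += -(psum << b) if b == bits - 1 else (psum << b)
    return acc

# Sanity check against an ordinary dot product.
assert bitserial_dot([3, -2], [5, 7]) == 3 * 5 + (-2) * 7
```

And a minimal PyTorch sketch of the training setup described above: a frozen backbone with a small parallel Rep-Net branch updated on new-task data, with straight-through INT8 fake quantization standing in for the paper's INT8 arithmetic. The module names and sizes here (`Backbone`, `RepNet`, the 16-channel conv) are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

def fake_quant_int8(x: torch.Tensor) -> torch.Tensor:
    """Symmetric per-tensor INT8 quantize-dequantize with a
    straight-through estimator for the gradient."""
    scale = x.abs().max().clamp(min=1e-8) / 127.0
    q = torch.clamp(torch.round(x / scale), -128, 127)
    return (q * scale).detach() + x - x.detach()

class Backbone(nn.Module):
    """Stand-in for the frozen major backbone model (hypothetical sizes)."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3, padding=1)
        self.head = nn.Linear(16, 10)
    def forward(self, x):
        f = torch.relu(self.conv(x)).mean(dim=(2, 3))
        return self.head(f)

class RepNet(nn.Module):
    """Small parallel reprogramming branch; only this part is trained."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3, padding=1)
        self.head = nn.Linear(16, 10)
    def forward(self, x):
        f = torch.relu(self.conv(fake_quant_int8(x))).mean(dim=(2, 3))
        return self.head(fake_quant_int8(f))

backbone, repnet = Backbone(), RepNet()
for p in backbone.parameters():        # freeze the backbone entirely
    p.requires_grad = False

opt = torch.optim.SGD(repnet.parameters(), lr=1e-2)
x = torch.randn(8, 3, 32, 32)          # a new-task mini-batch (dummy data)
y = torch.randint(0, 10, (8,))
opt.zero_grad()
logits = backbone(x) + repnet(x)       # parallel branch corrects frozen output
loss = nn.functional.cross_entropy(logits, y)
loss.backward()                        # gradients reach only the Rep-Net
opt.step()
```

In this setup only the small Rep-Net's weights are ever rewritten, which matches the motivation for doing the convolution and weight updates inside the MRAM array: the frozen backbone is read-only, and writes are confined to the compact branch.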
Pages: 2393-2404
Page count: 12