On-Device Continual Learning With STT-Assisted-SOT MRAM-Based In-Memory Computing

Cited by: 2
Authors
Zhang, Fan [1]
Sridharan, Amitesh [1]
Hwang, William [2]
Xue, Fen [2]
Tsai, Wilman [3]
Wang, Shan Xiang [4,5]
Fan, Deliang [1]
Affiliations
[1] Johns Hopkins Univ, Dept Elect & Comp Engn, Baltimore, MD 21218 USA
[2] Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA
[3] Stanford Univ, Dept Mat Sci & Engn, Stanford, CA 94305 USA
[4] Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA
[5] Stanford Univ, Dept Mat Sci & Engn, Stanford, CA 94305 USA
Funding
U.S. National Science Foundation
Keywords
Magnetic tunneling; Training; In-memory computing; Task analysis; Quantization (signal); Nonvolatile memory; Resistance; Continual learning; in-memory computing (IMC); MRAM; neural network;
DOI
10.1109/TCAD.2024.3371268
Chinese Library Classification (CLC)
TP3 [Computing technology, computer technology]
Discipline code
0812
Abstract
Due to the separate memory and computation units in the traditional von Neumann architecture, massive data transfer dominates the overall computing system's power and latency, known as the "memory wall" issue. With the ever-increasing size and computational complexity of deep-learning-based AI models, this has become the bottleneck of state-of-the-art AI computing systems. To address this challenge, in-memory computing (IMC)-based neural network accelerators have been widely investigated to perform AI computation within memory. However, most of those works focus only on inference; on-device training and continual learning have not yet been well explored. In this work, for the first time, we introduce on-device continual learning with an STT-assisted-SOT (SAS) magnetoresistive random-access memory (MRAM)-based IMC system. On the hardware side, we fabricated an SAS-MRAM device prototype with four magnetic tunnel junctions (MTJs, each 100 nm × 50 nm) sharing a common heavy-metal layer, achieving significantly improved memory-write and area efficiency compared to traditional SOT-MRAM. We then designed fully digital IMC circuits with the SAS-MRAM to support both neural network inference and on-device learning. To enable efficient on-device continual learning on new-task data, we present an 8-bit integer (INT8) continual learning algorithm that uses the bit-serial digital in-memory convolution operations supported by the SAS-MRAM IMC to train a small parallel reprogramming network (Rep-Net) while freezing the major backbone model. Extensive studies are presented based on the fabricated SAS-MRAM device prototype, cross-layer device-circuit benchmarking and simulation, and the evaluation of the on-device continual learning system.
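To make the abstract's two key ideas concrete, here is a minimal sketch of the bit-serial MAC scheme it refers to: input bits are streamed LSB-first against memory-resident INT8 weights and shifted partial sums are accumulated. This is a generic bit-serial illustration in plain Python, not the paper's actual circuit.

```python
def bitserial_dot(x_int8, w_int8, bits=8):
    """Signed INT8 dot product computed bit-serially: one input bit-plane
    per cycle is multiplied against memory-resident weights, and the shifted
    partial sums are accumulated (the MSB plane is subtracted, per two's
    complement). A generic sketch of bit-serial digital IMC, not the
    paper's exact datapath."""
    acc = 0
    for b in range(bits):
        xb = [(xi >> b) & 1 for xi in x_int8]            # input bit-plane b
        psum = sum(x * w for x, w in zip(xb, w_int8))    # binary-input MAC
        acc += -(psum << b) if b == bits - 1 else (psum << b)
    return acc

# Sanity check against an ordinary dot product.
assert bitserial_dot([3, -2], [5, 7]) == 3 * 5 + (-2) * 7
```

And a minimal PyTorch sketch of the training setup described above: a frozen backbone with a small parallel Rep-Net branch updated on new-task data, with straight-through INT8 fake quantization standing in for the paper's INT8 arithmetic. The module names and sizes here (`Backbone`, `RepNet`, the 16-channel conv) are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

def fake_quant_int8(x: torch.Tensor) -> torch.Tensor:
    """Symmetric per-tensor INT8 quantize-dequantize with a
    straight-through estimator for the gradient."""
    scale = x.abs().max().clamp(min=1e-8) / 127.0
    q = torch.clamp(torch.round(x / scale), -128, 127)
    return (q * scale).detach() + x - x.detach()

class Backbone(nn.Module):
    """Stand-in for the frozen major backbone model (hypothetical sizes)."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3, padding=1)
        self.head = nn.Linear(16, 10)
    def forward(self, x):
        f = torch.relu(self.conv(x)).mean(dim=(2, 3))
        return self.head(f)

class RepNet(nn.Module):
    """Small parallel reprogramming branch; only this part is trained."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3, padding=1)
        self.head = nn.Linear(16, 10)
    def forward(self, x):
        f = torch.relu(self.conv(fake_quant_int8(x))).mean(dim=(2, 3))
        return self.head(fake_quant_int8(f))

backbone, repnet = Backbone(), RepNet()
for p in backbone.parameters():        # freeze the backbone entirely
    p.requires_grad = False

opt = torch.optim.SGD(repnet.parameters(), lr=1e-2)
x = torch.randn(8, 3, 32, 32)          # a new-task mini-batch (dummy data)
y = torch.randint(0, 10, (8,))
opt.zero_grad()
logits = backbone(x) + repnet(x)       # parallel branch corrects frozen output
loss = nn.functional.cross_entropy(logits, y)
loss.backward()                        # gradients reach only the Rep-Net
opt.step()
```

In this setup only the small Rep-Net's weights are ever rewritten, which matches the motivation for doing the convolution and weight updates inside the MRAM array: the frozen backbone is read-only, and writes are confined to the compact branch.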
Pages: 2393-2404
Page count: 12