On-Device Continual Learning With STT-Assisted-SOT MRAM-Based In-Memory Computing

Cited by: 2
Authors
Zhang, Fan [1 ]
Sridharan, Amitesh [1 ]
Hwang, William [2 ]
Xue, Fen [2 ]
Tsai, Wilman [3 ]
Wang, Shan Xiang [4 ,5 ]
Fan, Deliang [1 ]
Affiliations
[1] Johns Hopkins Univ, Dept Elect & Comp Engn, Baltimore, MD 21218 USA
[2] Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA
[3] Stanford Univ, Dept Mat Sci & Engn, Stanford, CA 94305 USA
[4] Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA
[5] Stanford Univ, Dept Mat Sci & Engn, Stanford, CA 94305 USA
Funding
U.S. National Science Foundation;
Keywords
Magnetic tunneling; Training; In-memory computing; Task analysis; Quantization (signal); Nonvolatile memory; Resistance; Continual learning; in-memory computing (IMC); MRAM; neural network;
DOI
10.1109/TCAD.2024.3371268
CLC Classification
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
Due to the separate memory and computation units in the traditional von Neumann architecture, massive data transfer dominates the overall computing system's power and latency, known as the "Memory-Wall" issue. With the ever-increasing size and computational complexity of deep learning-based AI models, this has become the bottleneck for state-of-the-art AI computing systems. To address this challenge, in-memory computing (IMC)-based neural network accelerators have been widely investigated to support AI computing within memory. However, most of those works focus only on inference; on-device training and continual learning have not been well explored. In this work, for the first time, we introduce on-device continual learning with an STT-assisted-SOT (SAS) magnetoresistive random-access memory (MRAM)-based IMC system. On the hardware side, we have fabricated an STT-assisted-SOT MRAM (SAS-MRAM) device prototype with four magnetic tunnel junctions (MTJs), each 100 nm x 50 nm, sharing a common heavy metal layer, achieving significantly improved memory writing and area efficiency compared to traditional SOT-MRAM. We then design fully digital IMC circuits with our SAS-MRAM to support both neural network inference and on-device learning. To enable efficient on-device continual learning for new task data, we present an 8-bit integer (INT8)-based continual learning algorithm that utilizes our SAS-MRAM IMC-supported bit-serial digital in-memory convolution operations to train a small parallel reprogramming network (Rep-Net) while freezing the major backbone model. Extensive studies are presented based on our fabricated SAS-MRAM device prototype, cross-layer device-circuit benchmarking and simulation, and the on-device continual learning system evaluation.
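To illustrate the bit-serial digital in-memory convolution primitive referred to in the abstract, the following is a minimal NumPy sketch of a bit-serial signed INT8 dot product: weight bits are treated as stored bit-planes, activation bits are streamed plane by plane, and each bit pair contributes an AND-plus-popcount partial sum that is shifted and accumulated. This is only an algorithmic sketch under those assumptions, not the paper's circuit-level design; the function names and vector sizes are illustrative.

import numpy as np

def to_bitplanes(x, bits=8):
    # View signed INT8 values as two's-complement bytes, then split into 0/1 bit-planes (LSB first).
    u = x.astype(np.int64) & 0xFF
    return [(u >> b) & 1 for b in range(bits)]

def bit_serial_dot(w, a, bits=8):
    # Bit-serial dot product: for each weight-bit/activation-bit pair,
    # do an elementwise AND plus popcount (the IMC-style partial sum),
    # then shift-and-accumulate; the MSB plane carries a negative weight
    # because of two's-complement encoding.
    w_planes, a_planes = to_bitplanes(w, bits), to_bitplanes(a, bits)
    acc = 0
    for i, wp in enumerate(w_planes):
        w_sign = -1 if i == bits - 1 else 1
        for j, ap in enumerate(a_planes):
            a_sign = -1 if j == bits - 1 else 1
            popcount = int(np.sum(wp & ap))
            acc += w_sign * a_sign * (popcount << (i + j))
    return acc

# Sanity check against a plain INT8 dot product.
rng = np.random.default_rng(0)
w = rng.integers(-128, 128, size=64, dtype=np.int8)
a = rng.integers(-128, 128, size=64, dtype=np.int8)
assert bit_serial_dot(w, a) == int(np.dot(w.astype(np.int64), a.astype(np.int64)))

In the paper's setting, the same shift-accumulate decomposition would be applied to INT8 convolution during training of the small parallel Rep-Net while the backbone weights stay frozen; the sketch above only demonstrates the arithmetic equivalence of the bit-serial formulation.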
Pages: 2393-2404
Page count: 12