On-Device Continual Learning With STT-Assisted-SOT MRAM-Based In-Memory Computing

Cited by: 1
Authors
Zhang, Fan [1 ]
Sridharan, Amitesh [1 ]
Hwang, William [2 ]
Xue, Fen [2 ]
Tsai, Wilman [3 ]
Wang, Shan Xiang [4 ,5 ]
Fan, Deliang [1 ]
Affiliations
[1] Johns Hopkins Univ, Dept Elect & Comp Engn, Baltimore, MD 21218 USA
[2] Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA
[3] Stanford Univ, Dept Mat Sci & Engn, Stanford, CA 94305 USA
[4] Stanford Univ, Dept Elect Engn, Stanford, CA 94305 USA
[5] Stanford Univ, Dept Mat Sci & Engn, Stanford, CA 94305 USA
Funding
US National Science Foundation;
Keywords
Magnetic tunneling; Training; In-memory computing; Task analysis; Quantization (signal); Nonvolatile memory; Resistance; Continual learning; in-memory computing (IMC); MRAM; neural network;
DOI
10.1109/TCAD.2024.3371268
Chinese Library Classification
TP3 [Computing Technology, Computer Technology];
Discipline Code
0812;
Abstract
Due to the separate memory and computation units in the traditional von Neumann architecture, massive data transfer dominates the overall computing system's power and latency, known as the "memory wall" issue. With the ever-increasing size and computing complexity of deep-learning-based AI models, this has become the bottleneck of state-of-the-art AI computing systems. To address this challenge, in-memory computing (IMC)-based neural network accelerators have been widely investigated to support AI computing within memory. However, most of those works focus only on inference; on-device training and continual learning have not yet been well explored. In this work, for the first time, we introduce on-device continual learning with an STT-assisted-SOT (SAS) magnetoresistive random-access memory (MRAM)-based IMC system. On the hardware side, we fabricated an SAS-MRAM device prototype with four magnetic tunnel junctions (MTJs, each 100 nm x 50 nm) sharing a common heavy-metal layer, achieving significantly improved memory-write and area efficiency compared to traditional SOT-MRAM. Next, we designed fully digital IMC circuits with our SAS-MRAM to support both neural network inference and on-device learning. To enable efficient on-device continual learning on new task data, we present an 8-bit integer (INT8)-based continual learning algorithm that utilizes our SAS-MRAM IMC-supported bit-serial digital in-memory convolution operations to train a small parallel reprogramming network (Rep-Net) while freezing the major backbone model. Extensive studies are presented based on our fabricated SAS-MRAM device prototype, cross-layer device-circuit benchmarking and simulation, and the on-device continual learning system evaluation.
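To illustrate the bit-serial digital in-memory convolution that the INT8 continual learning algorithm relies on, the following is a minimal Python sketch, an assumption for exposition rather than the authors' circuit or code: each weight bit-plane (as would be stored across MRAM cells) gates the INT8 activations, and the partial sums are accumulated with power-of-two shifts, reproducing the full INT8 multiply-accumulate result.

import numpy as np

# Illustrative sketch only: a bit-serial INT8 dot product of the kind a fully
# digital IMC macro performs, iterating over weight bit-planes and shifting the
# partial sums. Function and variable names are assumptions, not from the paper.
def bit_serial_int8_dot(activations, weights):
    """Return sum(a * w) for INT8 vectors, computed one weight bit-plane at a time."""
    a = activations.astype(np.int32)
    # Two's-complement view of the weights: bits 0..6 carry +2^b, bit 7 carries -2^7.
    w_bits = weights.astype(np.uint8).astype(np.int32)
    acc = 0
    for b in range(8):
        plane = (w_bits >> b) & 1           # one stored weight bit-plane (0/1)
        partial = int(np.sum(a * plane))    # bitwise-gated accumulation of activations
        acc += partial * (-(1 << 7) if b == 7 else (1 << b))
    return acc

# Sanity check against a direct INT8 multiply-accumulate.
rng = np.random.default_rng(0)
x = rng.integers(-128, 128, size=64, dtype=np.int8)
w = rng.integers(-128, 128, size=64, dtype=np.int8)
assert bit_serial_int8_dot(x, w) == int(np.sum(x.astype(np.int32) * w.astype(np.int32)))

In this reading, only the small Rep-Net weights would be rewritten in MRAM during continual learning, while the frozen backbone weights stay in place and are reused for inference-style bit-serial operations.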
Pages: 2393 - 2404
Page count: 12