In-Memory Transformer Self-Attention Mechanism Using Passive Memristor Crossbar

Cited: 0
Authors
Cai, Jack [1 ]
Kaleem, Muhammad Ahsan [1 ]
Genov, Roman [1 ]
Azghadi, Mostafa Rahimi [2 ]
Amirsoleimani, Amirali [3 ]
Affiliations
[1] Univ Toronto, Dept Elect & Comp Engn, Toronto, ON, Canada
[2] James Cook Univ, Coll Sci & Engn, Townsville, Qld 4811, Australia
[3] York Univ, Dept Elect Engn & Comp Sci, Toronto, ON M3J 1P3, Canada
Keywords
Memristor; In-Memory; Self-Attention; Neural Network Training; Backpropagation; Transformer;
DOI
10.1109/ISCAS58744.2024.10558182
Chinese Library Classification (CLC)
TP39 [Computer Applications];
Discipline Classification Code
081203; 0835;
Abstract
Transformers have emerged as the state-of-the-art architecture for natural language processing (NLP) and computer vision. However, they are inefficient on both conventional and in-memory computing architectures: because of their self-attention mechanism, doubling the sequence length quadruples their time and memory complexity. Traditional approaches optimize self-attention with memory-efficient algorithms or approximate methods such as locality-sensitive hashing (LSH) attention, which reduces time and memory complexity from O(L²) to O(L log L). In this work, we propose a hardware-level solution that further improves the computational efficiency of LSH attention by utilizing in-memory computing with semi-passive memristor arrays. We demonstrate that LSH can be performed with low-resolution, energy-efficient 0T1R arrays performing stochastic memristive vector-matrix multiplication (VMM). Using circuit-level simulation, we show that our proposed method is feasible as a drop-in approximation in Large Language Models (LLMs) with no degradation in evaluation metrics. Our results lay the foundation for future work on computing the entire transformer architecture in-memory.
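To make the abstract's core idea concrete, below is a minimal NumPy sketch (not the authors' implementation) of Reformer-style angular-LSH attention: the random-projection matrix multiply that assigns tokens to buckets is the vector-matrix multiplication the paper proposes to execute on a low-resolution 0T1R memristor crossbar, here emulated in software with a crude weight quantization standing in for the low-resolution devices. All function names and parameters below are illustrative assumptions.

```python
# Minimal sketch, assuming Reformer-style shared-QK angular LSH; not from the paper.
import numpy as np

def lsh_bucket_ids(x, n_buckets, rng, weight_bits=None):
    """Hash each row of x (L, d) into one of n_buckets angular-LSH buckets."""
    d = x.shape[-1]
    # Random projection matrix; n_buckets // 2 hyperplanes give n_buckets half-spaces.
    r = rng.standard_normal((d, n_buckets // 2))
    if weight_bits is not None:
        # Crude stand-in for a low-resolution crossbar: quantize the projection weights.
        scale = np.abs(r).max() / (2 ** (weight_bits - 1) - 1)
        r = np.round(r / scale) * scale
    proj = x @ r                                   # the VMM a crossbar would perform
    proj = np.concatenate([proj, -proj], axis=-1)  # angular-LSH trick
    return np.argmax(proj, axis=-1)                # bucket id per token

def lsh_attention(q, k, v, n_buckets=8, weight_bits=4, seed=0):
    """Approximate softmax attention by restricting each query to keys in its bucket."""
    rng = np.random.default_rng(seed)
    buckets = lsh_bucket_ids(q, n_buckets, rng, weight_bits)  # assumes shared QK (q == k)
    out = np.zeros_like(v)
    for b in np.unique(buckets):
        idx = np.where(buckets == b)[0]            # tokens sharing a bucket
        scores = q[idx] @ k[idx].T / np.sqrt(q.shape[-1])
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[idx] = weights @ v[idx]
    return out

# Example: 128 tokens, 64-dim heads; each query attends only within its bucket.
L, d = 128, 64
rng = np.random.default_rng(1)
q = k = rng.standard_normal((L, d))   # shared QK projection, as in Reformer
v = rng.standard_normal((L, d))
print(lsh_attention(q, k, v).shape)   # (128, 64)
```

Because each query only attends within its bucket, the per-query cost drops from L comparisons to roughly L / n_buckets, which is the source of the O(L log L) scaling the abstract cites; in the proposed hardware, the bucketing projection itself would be computed by the stochastic memristive VMM rather than in floating point.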
Pages: 5
Related Papers
50 records in total
  • [31] Attention to Emotions: Body Emotion Recognition In-the-Wild Using Self-attention Transformer Network
    Paiva, Pedro V. V.
    Ramos, Josue J. G.
    Gavrilova, Marina
    Carvalho, Marco A. G.
    COMPUTER VISION, IMAGING AND COMPUTER GRAPHICS THEORY AND APPLICATIONS, VISIGRAPP 2023, 2024, 2103 : 206 - 228
  • [32] Local self-attention in transformer for visual question answering
    Shen, Xiang
    Han, Dezhi
    Guo, Zihan
    Chen, Chongqing
    Hua, Jie
    Luo, Gaofeng
    APPLIED INTELLIGENCE, 2023, 53 (13) : 16706 - 16723
  • [34] Vision Transformer Based on Reconfigurable Gaussian Self-attention
    Zhao L.
    Zhou J.-K.
    Zidonghua Xuebao/Acta Automatica Sinica, 2023, 49 (09) : 1976 - 1988
  • [35] Tree Transformer: Integrating Tree Structures into Self-Attention
    Wang, Yau-Shian
    Lee, Hung-Yi
    Chen, Yun-Nung
    2019 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING AND THE 9TH INTERNATIONAL JOINT CONFERENCE ON NATURAL LANGUAGE PROCESSING (EMNLP-IJCNLP 2019): PROCEEDINGS OF THE CONFERENCE, 2019, : 1061 - 1070
  • [36] Maximization of Crossbar Array Memory Using Fundamental Memristor Theory
    Eshraghian, Jason K.
    Cho, Kyoung-Rok
    Iu, Herbert H. C.
    Fernando, Tyrone
    Iannella, Nicolangelo
    Kang, Sung-Mo
    Eshraghian, Kamran
    IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2017, 64 (12) : 1402 - 1406
  • [37] A lightweight transformer with linear self-attention for defect recognition
    Zhai, Yuwen
    Li, Xinyu
    Gao, Liang
    Gao, Yiping
    ELECTRONICS LETTERS, 2024, 60 (17)
  • [38] An efficient parallel self-attention transformer for CSI feedback
    Liu, Ziang
    Song, Tianyu
    Zhao, Ruohan
    Jin, Jiyu
    Jin, Guiyue
    PHYSICAL COMMUNICATION, 2024, 66
  • [39] Transformer Self-Attention Network for Forecasting Mortality Rates
    Roshani, Amin
    Izadi, Muhyiddin
    Khaledi, Baha-Eldin
    JIRSS-JOURNAL OF THE IRANIAN STATISTICAL SOCIETY, 2022, 21 (01): : 81 - 103
  • [40] Keyword Transformer: A Self-Attention Model for Keyword Spotting
    Berg, Axel
    O'Connor, Mark
    Cruz, Miguel Tairum
    INTERSPEECH 2021, 2021, : 4249 - 4253