In-Memory Transformer Self-Attention Mechanism Using Passive Memristor Crossbar

Cited: 0
Authors
Cai, Jack [1 ]
Kaleem, Muhammad Ahsan [1 ]
Genov, Roman [1 ]
Azghadi, Mostafa Rahimi [2 ]
Amirsoleimani, Amirali [3 ]
Affiliations
[1] Univ Toronto, Dept Elect & Comp Engn, Toronto, ON, Canada
[2] James Cook Univ, Coll Sci & Engn, Townsville, Qld 4811, Australia
[3] York Univ, Dept Elect Engn & Comp Sci, Toronto, ON M3J 1P3, Canada
Keywords
Memristor; In-Memory; Self-Attention; Neural Network Training; Backpropagation; Transformer;
DOI
10.1109/ISCAS58744.2024.10558182
Chinese Library Classification (CLC)
TP39 [Computer Applications];
Discipline Codes
081203 ; 0835 ;
Abstract
Transformers have emerged as the state-of-the-art architecture for natural language processing (NLP) and computer vision. However, they are inefficient in both conventional and in-memory computing architectures: because of the self-attention mechanism, doubling the sequence length quadruples their time and memory cost. Traditional approaches optimize self-attention with memory-efficient or approximate algorithms, such as locality-sensitive hashing (LSH) attention, which reduces time and memory complexity from O(L²) to O(L log L). In this work, we propose a hardware-level solution that further improves the computational efficiency of LSH attention by using in-memory computing with semi-passive memristor arrays. We demonstrate that LSH can be performed with low-resolution, energy-efficient 0T1R arrays carrying out stochastic memristive vector-matrix multiplication (VMM). Using circuit-level simulation, we show that the proposed method is feasible as a drop-in approximation in Large Language Models (LLMs) with no degradation in evaluation metrics. Our results lay the foundation for future work on computing the entire transformer architecture in-memory.
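To make the complexity reduction concrete, the sketch below (a minimal NumPy illustration, not the paper's circuit-level method) phrases the LSH step as a single vector-matrix multiplication against a random projection matrix that has been stochastically rounded to a few levels, mirroring the role the abstract assigns to low-resolution 0T1R arrays; tokens then attend only within their hash bucket. The function names (quantize_low_resolution, lsh_buckets, lsh_attention), the 2-bit rounding, and the shared query/key convention are illustrative assumptions.

```python
# Minimal NumPy sketch of LSH attention driven by a single vector-matrix
# multiplication (VMM) with a low-resolution, stochastically rounded projection
# matrix. Illustration of the general technique only; the function names, the
# 2-bit rounding, and the shared query/key convention are assumptions, not the
# paper's circuit-level 0T1R implementation.
import numpy as np


def quantize_low_resolution(w, bits=2, rng=None):
    """Stochastically round w onto 2**bits - 1 levels, loosely mimicking the
    few conductance states available in a low-resolution memristor array."""
    rng = np.random.default_rng() if rng is None else rng
    levels = 2 ** bits - 1
    lo, hi = w.min(), w.max()
    scaled = (w - lo) / (hi - lo) * levels
    floor = np.floor(scaled)
    q = floor + (rng.random(w.shape) < (scaled - floor))  # stochastic rounding
    return q / levels * (hi - lo) + lo


def lsh_buckets(x, n_hashes=4, bits=2, rng=None):
    """Assign each row of x to a bucket via sign random projections.
    The x @ proj product is the step that maps onto an in-memory VMM."""
    rng = np.random.default_rng(0) if rng is None else rng
    proj = rng.standard_normal((x.shape[-1], n_hashes))       # random hyperplanes
    proj = quantize_low_resolution(proj, bits, rng)           # low-resolution weights
    signs = (x @ proj) > 0                                     # VMM + sign comparator
    return (signs * (2 ** np.arange(n_hashes))).sum(axis=-1)  # pack bits -> bucket id


def lsh_attention(q, k, v, n_hashes=4):
    """Approximate attention: tokens attend only within their LSH bucket,
    avoiding the full L x L score matrix of exact self-attention."""
    buckets = lsh_buckets(q, n_hashes)
    out = np.zeros_like(v)
    for b in np.unique(buckets):
        idx = np.where(buckets == b)[0]
        scores = q[idx] @ k[idx].T / np.sqrt(q.shape[-1])
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[idx] = weights @ v[idx]
    return out


# Usage: a 128-token sequence with 64-dimensional embeddings, shared Q/K
rng = np.random.default_rng(1)
q = k = rng.standard_normal((128, 64))
v = rng.standard_normal((128, 64))
print(lsh_attention(q, k, v).shape)  # -> (128, 64)
```

Because each bucket holds roughly L / 2^n_hashes tokens, the per-bucket score matrices stay small, which is the source of the sub-quadratic saving relative to the full L x L attention matrix.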
Pages: 5
Related Papers
50 records in total
  • [41] A personalized paper recommendation method based on knowledge graph and transformer encoder with a self-attention mechanism
    Gao, Li
    Lan, Yu
    Yu, Zhen
    Zhu, Jian-min
    APPLIED INTELLIGENCE, 2023, 53 (24) : 29991 - 30008
  • [43] Unsupervised Pansharpening Based on Self-Attention Mechanism
    Qu, Ying
    Baghbaderani, Razieh Kaviani
    Qi, Hairong
    Kwan, Chiman
IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2021, 59 (04) : 3192 - 3208
  • [44] Linear Complexity Randomized Self-attention Mechanism
    Zheng, Lin
    Wang, Chong
    Kong, Lingpeng
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [45] Keyphrase Generation Based on Self-Attention Mechanism
    Yang, Kehua
    Wang, Yaodong
    Zhang, Wei
    Yao, Jiqing
    Le, Yuquan
CMC-COMPUTERS MATERIALS & CONTINUA, 2019, 61 (02) : 569 - 581
  • [46] Self-Attention Mechanism in GANs for Molecule Generation
    Chinnareddy, Sandeep
    Grandhi, Pranav
    Narayan, Apurva
    20TH IEEE INTERNATIONAL CONFERENCE ON MACHINE LEARNING AND APPLICATIONS (ICMLA 2021), 2021, : 57 - 60
  • [48] Self-rectifying resistive memory in passive crossbar arrays
    Jeon, Kanghyeok
    Kim, Jeeson
    Ryu, Jin Joo
    Yoo, Seung-Jong
    Song, Choongseok
    Yang, Min Kyu
    Jeong, Doo Seok
    Kim, Gun Hwan
    NATURE COMMUNICATIONS, 2021, 12 (01)
  • [49] Self-attention mechanism for distributed compressive sensing
    Shu, Feng
    Zhang, Linghua
    Ding, Yin
    Cheng, Qin
    Wang, Xu
    ELECTRONICS LETTERS, 2022, 58 (10) : 405 - 407
  • [50] Attention Guided CAM: Visual Explanations of Vision Transformer Guided by Self-Attention
    Leem, Saebom
    Seo, Hyunseok
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 4, 2024, : 2956 - 2964