In-Memory Transformer Self-Attention Mechanism Using Passive Memristor Crossbar

Cited by: 0
Authors
Cai, Jack [1 ]
Kaleem, Muhammad Ahsan [1 ]
Genov, Roman [1 ]
Azghadi, Mostafa Rahimi [2 ]
Amirsoleimani, Amirali [3 ]
Affiliations
[1] Univ Toronto, Dept Elect & Comp Engn, Toronto, ON, Canada
[2] James Cook Univ, Coll Sci & Engn, Townsville, Qld 4811, Australia
[3] York Univ, Dept Elect Engn & Comp Sci, Toronto, ON M3J 1P3, Canada
Keywords
Memristor; In-Memory; Self-Attention; Neural Network Training; Backpropagation; Transformer;
DOI
10.1109/ISCAS58744.2024.10558182
Chinese Library Classification
TP39 [Computer Applications]
Discipline Classification Codes
081203; 0835
Abstract
Transformers have emerged as the state-of-the-art architecture for natural language processing (NLP) and computer vision. However, they are inefficient on both conventional and in-memory computing architectures because the time and memory complexity of their self-attention mechanism grows quadratically with sequence length: doubling the sequence length quadruples both. Traditional approaches optimize self-attention with memory-efficient algorithms or approximations such as locality-sensitive hashing (LSH) attention, which reduces time and memory complexity from O(L^2) to O(L log L). In this work, we propose a hardware-level solution that further improves the computational efficiency of LSH attention by utilizing in-memory computing with semi-passive memristor arrays. We demonstrate that LSH can be performed with low-resolution, energy-efficient 0T1R arrays performing stochastic memristive vector-matrix multiplication (VMM). Using circuit-level simulation, we show that the proposed method is feasible as a drop-in approximation in Large Language Models (LLMs) with no degradation in evaluation metrics. Our results lay the foundation for future work on computing the entire transformer architecture in-memory.
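As a rough, software-level illustration of the LSH-attention idea the abstract builds on (a minimal sketch, not the authors' circuit-level method), the snippet below hashes tokens with a random-projection vector-matrix multiply, the operation the paper proposes to map onto low-resolution 0T1R crossbars, and then restricts attention to within-bucket pairs so the dense O(L^2) score matrix is replaced by small per-bucket blocks. All names and parameters here (lsh_bucket_ids, lsh_attention, n_buckets) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def lsh_bucket_ids(x, n_buckets, rng):
    """Angular LSH via random projection: one vector-matrix multiply
    (the kind of VMM that could be offloaded to a memristor crossbar)
    followed by an argmax to pick a bucket per token."""
    # Random projection matrix; in the hardware setting this role would be
    # played by the (low-resolution, stochastic) crossbar conductances.
    proj = rng.standard_normal((x.shape[-1], n_buckets // 2))
    h = x @ proj                          # the VMM step
    h = np.concatenate([h, -h], axis=-1)  # angular-LSH trick (Reformer-style)
    return np.argmax(h, axis=-1)          # bucket id per token

def lsh_attention(q, k, v, n_buckets=8, seed=0):
    """Toy LSH self-attention: tokens attend only within their hash bucket,
    so the quadratic score matrix becomes a set of small per-bucket blocks."""
    rng = np.random.default_rng(seed)
    buckets = lsh_bucket_ids(q, n_buckets, rng)
    out = np.zeros_like(v)
    d = q.shape[-1]
    for b in np.unique(buckets):
        idx = np.where(buckets == b)[0]
        scores = q[idx] @ k[idx].T / np.sqrt(d)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[idx] = weights @ v[idx]
    return out

# Usage: L tokens of dimension d, with shared Q/K as in Reformer-style LSH attention.
L, d = 128, 64
rng = np.random.default_rng(1)
q = k = rng.standard_normal((L, d))
v = rng.standard_normal((L, d))
print(lsh_attention(q, k, v).shape)  # (128, 64)
```

In this sketch the only dense projection is the hashing VMM, which is why a low-precision, stochastic analog multiply is sufficient: hashing tolerates noise as long as nearby vectors still land in the same bucket.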
Pages: 5