Efficient memristor accelerator for transformer self-attention functionality

Cited: 0
Authors
Bettayeb, Meriem [1 ,2 ]
Halawani, Yasmin [3 ]
Khan, Muhammad Umair [1 ]
Saleh, Hani [1 ]
Mohammad, Baker [1 ]
Affiliations
[1] Khalifa Univ, Syst Onchip Lab, Comp & Informat Engn, Abu Dhabi, U Arab Emirates
[2] Abu Dhabi Univ, Coll Engn, Comp Sci & Informat Technol Dept, Abu Dhabi, U Arab Emirates
[3] Univ Dubai, Coll Engn & IT, Dubai, U Arab Emirates
Source
SCIENTIFIC REPORTS | 2024, Vol. 14, No. 1
Keywords
ARCHITECTURE;
D O I
10.1038/s41598-024-75021-z
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences];
Discipline Classification Code
07; 0710; 09;
Abstract
The adoption of transformer networks has surged across AI applications. However, their increased computational complexity, stemming primarily from the self-attention mechanism, limits transformers much as convolution operations constrain the capability and speed of convolutional neural networks (CNNs). The self-attention algorithm, specifically its matrix-matrix multiplication (MatMul) operations, demands substantial memory and computation, thereby restricting the overall performance of the transformer. This paper introduces an efficient hardware accelerator for the transformer network, leveraging memristor-based in-memory computing. The design targets the memory bottleneck of the MatMul operations in self-attention, exploiting approximate analog computation and the highly parallel computation offered by the memristor crossbar architecture. This approach reduces the number of multiply-accumulate (MAC) operations in transformer networks by approximately 10 times while maintaining 95.47% accuracy on the MNIST dataset, as validated with the comprehensive circuit simulator NeuroSim 3.0. Simulation results indicate an area of 6895.7 μm², a latency of 15.52 s, an energy consumption of 3 mJ, and a leakage power of 59.55 μW. The methodology outlined in this paper represents a substantial stride towards a hardware-friendly transformer architecture for edge devices, poised to achieve real-time performance.
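A minimal illustrative sketch (Python/NumPy, not the authors' implementation) of the point the abstract makes: every projection and attention-score computation in self-attention is a MatMul, and a memristor crossbar evaluates such products as analog multiply-accumulate operations with limited conductance precision. The names `self_attention` and `crossbar_matmul` and the `levels` parameter are assumptions for illustration only; real crossbar designs include DAC/ADC stages and device non-idealities omitted here.

```python
# Sketch only: where the MatMuls sit in self-attention, plus a toy stand-in
# for a memristor crossbar that stores a weight matrix as quantized
# conductances and computes the MAC as a dot product.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Standard scaled dot-product self-attention; each matrix product here
    is a candidate for crossbar-based in-memory acceleration."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv           # projection MatMuls
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # attention-score MatMul
    return softmax(scores) @ V                 # weighted-sum MatMul

def crossbar_matmul(X, W, levels=16):
    """Toy crossbar model: quantize W to a few conductance levels (a stand-in
    for limited memristor precision), then perform the MAC digitally."""
    w_max = np.abs(W).max() + 1e-12
    G = np.round(W / w_max * (levels - 1)) / (levels - 1) * w_max
    return X @ G

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 8
    X = rng.standard_normal((4, d))            # 4 tokens, d-dim embeddings
    Wq, Wk, Wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
    exact = self_attention(X, Wq, Wk, Wv)
    approx = softmax(crossbar_matmul(X, Wq) @ crossbar_matmul(X, Wk).T
                     / np.sqrt(d)) @ crossbar_matmul(X, Wv)
    print("max abs deviation from exact attention:",
          np.abs(exact - approx).max())
```

Running the script prints the deviation introduced by the coarse conductance quantization, illustrating (under these toy assumptions) why approximate analog MatMul can trade a small accuracy loss for large savings in digital MAC operations.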
Pages: 15
Related Papers
50 records
  • [1] An efficient parallel self-attention transformer for CSI feedback
    Liu, Ziang
    Song, Tianyu
    Zhao, Ruohan
    Jin, Jiyu
    Jin, Guiyue
    PHYSICAL COMMUNICATION, 2024, 66
  • [2] In-Memory Transformer Self-Attention Mechanism Using Passive Memristor Crossbar
    Cai, Jack
    Kaleem, Muhammad Ahsan
    Genov, Roman
    Azghadi, Mostafa Rahimi
    Amirsoleimani, Amirali
    2024 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS 2024, 2024
  • [3] Decomformer: Decompose Self-Attention of Transformer for Efficient Image Restoration
    Lee, Eunho
    Hwang, Youngbae
    IEEE ACCESS, 2024, 12 : 38672 - 38684
  • [4] Relative molecule self-attention transformer
    Łukasz Maziarka
    Dawid Majchrowski
    Tomasz Danel
    Piotr Gaiński
    Jacek Tabor
    Igor Podolak
    Paweł Morkisz
    Stanisław Jastrzębski
    Journal of Cheminformatics, 16
  • [5] Relative molecule self-attention transformer
    Maziarka, Lukasz
    Majchrowski, Dawid
    Danel, Tomasz
    Gainski, Piotr
    Tabor, Jacek
    Podolak, Igor
    Morkisz, Pawel
    Jastrzebski, Stanislaw
    JOURNAL OF CHEMINFORMATICS, 2024, 16 (01)
  • [6] Dual-former: Hybrid self-attention transformer for efficient image restoration
    Chen, Sixiang
    Ye, Tian
    Liu, Yun
    Chen, Erkang
    DIGITAL SIGNAL PROCESSING, 2024, 149
  • [7] Universal Graph Transformer Self-Attention Networks
    Dai Quoc Nguyen
    Tu Dinh Nguyen
    Dinh Phung
    COMPANION PROCEEDINGS OF THE WEB CONFERENCE 2022, WWW 2022 COMPANION, 2022, : 193 - 196
  • [8] Sparse self-attention transformer for image inpainting
    Huang, Wenli
    Deng, Ye
    Hui, Siqi
    Wu, Yang
    Zhou, Sanping
    Wang, Jinjun
    PATTERN RECOGNITION, 2024, 145
  • [9] SST: self-attention transformer for infrared deconvolution
    Gao, Lei
    Yan, Xiaohong
    Deng, Lizhen
    Xu, Guoxia
    Zhu, Hu
    INFRARED PHYSICS & TECHNOLOGY, 2024, 140
  • [10] Lite Vision Transformer with Enhanced Self-Attention
    Yang, Chenglin
    Wang, Yilin
    Zhang, Jianming
    Zhang, He
    Wei, Zijun
    Lin, Zhe
    Yuille, Alan
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2022, : 11988 - 11998