Energy Efficient FPGA-Based Binary Transformer Accelerator for Edge Devices

Cited by: 0
Authors
Du, Congpeng [1 ]
Ko, Seok-Bum [2 ]
Zhang, Hao [1 ]
Affiliations
[1] Ocean Univ China, Fac Informat Sci & Engn, Qingdao, Peoples R China
[2] Univ Saskatchewan, Dept Elect & Comp Engn, Saskatoon, SK, Canada
DOI
10.1109/ISCAS58744.2024.10558631
CLC Classification
TP39 [Computer Applications];
Discipline Codes
081203; 0835
Abstract
Transformer-based large language models have gained much attention recently. Due to their superior performance, they are expected to replace conventional deep learning methods in many application fields, including edge computing. However, transformer models require even more computation and parameters than convolutional neural networks, which makes them challenging to deploy on resource-constrained edge devices. To tackle this problem, an efficient FPGA-based binary transformer accelerator is proposed in this paper. Within the proposed architecture, an energy-efficient matrix multiplication decomposition method is proposed to reduce the amount of computation. Moreover, an efficient binarized Softmax computation method is also proposed to reduce the memory footprint during Softmax computation. The proposed architecture is implemented on a Xilinx Zynq UltraScale+ device, and implementation results show that the proposed matrix multiplication decomposition method can reduce runtime computation by up to 78%. The proposed transformer accelerator achieves improved throughput and energy efficiency compared to previous transformer accelerator designs.
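The paper's specific decomposition method is not detailed in this record, but the core efficiency gain of binary accelerators like this one comes from a standard identity: when weights and activations are constrained to {-1, +1} and bit-packed (1 → +1, 0 → -1), a length-n dot product collapses to an XNOR and a popcount, replacing n multiply-accumulates with a handful of bitwise operations. A minimal sketch of that XNOR-popcount trick (function name and packing convention are illustrative, not from the paper):

```python
def binary_dot(a_bits: int, w_bits: int, n: int) -> int:
    """Dot product of two {-1, +1} vectors of length n, each bit-packed
    into an integer (bit = 1 encodes +1, bit = 0 encodes -1).

    Each XOR mismatch contributes -1 and each match +1, so
    dot = n - 2 * popcount(a XOR w).
    """
    mismatches = bin((a_bits ^ w_bits) & ((1 << n) - 1)).count("1")
    return n - 2 * mismatches

# Example: a = [+1, -1, +1] -> 0b101, w = [+1, +1, +1] -> 0b111
# Exact dot product is (+1)(+1) + (-1)(+1) + (+1)(+1) = 1.
print(binary_dot(0b101, 0b111, 3))  # -> 1
```

On an FPGA, this maps naturally to LUT-based XNOR arrays and popcount trees instead of DSP multipliers, which is the main source of the energy savings such binary designs report.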
Pages: 5
Related Papers (50 total)
  • [31] Energy-Efficient Architecture for FPGA-based Deep Convolutional Neural Networks with Binary Weights
    Duan, Yunzhi
    Li, Shuai
    Zhang, Ruipeng
    Wang, Qi
    Chen, Jienan
    Sobelman, Gerald E.
    2018 IEEE 23RD INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING (DSP), 2018,
  • [32] An Energy-Efficient FPGA-based Matrix Multiplier
    Tan, Yiyu
    Imamura, Toshiyuki
    2017 24TH IEEE INTERNATIONAL CONFERENCE ON ELECTRONICS, CIRCUITS AND SYSTEMS (ICECS), 2017, : 514 - 517
  • [33] An Energy-Efficient Accelerator Based on Hybrid CPU-FPGA Devices for Password Recovery
    Liu, Peng
    Li, Shunbin
    Ding, Qingyuan
    IEEE TRANSACTIONS ON COMPUTERS, 2019, 68 (02) : 170 - 181
  • [34] A Fast and Efficient FPGA-based Level Set Hardware Accelerator for Image Segmentation
    Liu Ye
    Xiao Jianbiao
    Wu Fei
    Chang Liang
    Zhou Jun
    JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2021, 43 (06) : 1525 - 1532
  • [35] A Flexible FPGA-Based Accelerator for Efficient Inference of Multi-Precision CNNs
    Liu, Xinyan
    Wu, Xiao
    Shao, Haiku
    Wang, Zhongfeng
    2024 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS 2024, 2024,
  • [36] An FPGA-Based Accelerator Enabling Efficient Support for CNNs with Arbitrary Kernel Sizes
    Wang, Miaoxin
    Wu, Xiao
    Lin, Jun
    Wang, Zhongfeng
    2024 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS 2024, 2024,
  • [37] Efficient FPGA-Based Accelerator of the L-BFGS Algorithm for IoT Applications
    Xiong, Huiyang
    Xiong, Bohang
    Wang, Wenhao
    Tian, Jing
    Zhu, Hao
    Wang, Zhongfeng
    2023 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS, 2023,
  • [38] Amoeba: An Efficient and Flexible FPGA-Based Accelerator for Arbitrary-Kernel CNNs
    Wu, Xiao
    Wang, Miaoxin
    Lin, Jun
    Wang, Zhongfeng
    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2024, 32 (06) : 1086 - 1099
  • [39] On Exploiting Patterns For Robust FPGA-based Multi-accelerator Edge Computing Systems
    Razavi, Seyyed Ahmad
    Ting, Hsin-Yu
    Giyahchi, Thotiya
    Bozorgzadeh, Eli
    PROCEEDINGS OF THE 2022 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE 2022), 2022, : 116 - 119
  • [40] An FPGA-based binary neural network accelerator with enhanced hardware efficiency and data reuse
    Zhang, Dezheng
    Cen, Rui
    Pu, Han
    Wan, Rui
    Wang, Dong
    MICROELECTRONICS JOURNAL, 2025, 156