Energy Efficient FPGA-Based Binary Transformer Accelerator for Edge Devices

Cited by: 0
Authors
Du, Congpeng [1 ]
Ko, Seok-Bum [2 ]
Zhang, Hao [1 ]
Affiliations
[1] Ocean Univ China, Fac Informat Sci & Engn, Qingdao, Peoples R China
[2] Univ Saskatchewan, Dept Elect & Comp Engn, Saskatoon, SK, Canada
DOI
10.1109/ISCAS58744.2024.10558631
CLC Classification
TP39 [Computer Applications];
Discipline Codes
081203; 0835
Abstract
Transformer-based large language models have gained much attention recently. Due to their superior performance, they are expected to replace conventional deep learning methods in many application fields, including edge computing. However, transformer models require even more computation and parameters than convolutional neural networks, which makes them challenging to deploy on resource-constrained edge devices. To tackle this problem, an efficient FPGA-based binary transformer accelerator is proposed in this paper. Within the proposed architecture, an energy-efficient matrix multiplication decomposition method is proposed to reduce the amount of computation. Moreover, an efficient binarized Softmax computation method is also proposed to reduce the memory footprint during Softmax computation. The proposed architecture is implemented on a Xilinx Zynq UltraScale+ device, and implementation results show that the proposed matrix multiplication decomposition method can reduce runtime computation by up to 78%. The proposed transformer accelerator achieves improved throughput and energy efficiency compared to previous transformer accelerator designs.
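The paper's specific decomposition method is not detailed in this record, but the core efficiency gain of binary accelerators like this one comes from a standard identity: when weights and activations are constrained to {-1, +1} and bit-packed (1 → +1, 0 → -1), a length-n dot product collapses to an XNOR and a popcount, replacing n multiply-accumulates with a handful of bitwise operations. A minimal sketch of that XNOR-popcount trick (function name and packing convention are illustrative, not from the paper):

```python
def binary_dot(a_bits: int, w_bits: int, n: int) -> int:
    """Dot product of two {-1, +1} vectors of length n, each bit-packed
    into an integer (bit = 1 encodes +1, bit = 0 encodes -1).

    Each XOR mismatch contributes -1 and each match +1, so
    dot = n - 2 * popcount(a XOR w).
    """
    mismatches = bin((a_bits ^ w_bits) & ((1 << n) - 1)).count("1")
    return n - 2 * mismatches

# Example: a = [+1, -1, +1] -> 0b101, w = [+1, +1, +1] -> 0b111
# Exact dot product is (+1)(+1) + (-1)(+1) + (+1)(+1) = 1.
print(binary_dot(0b101, 0b111, 3))  # -> 1
```

On an FPGA, this maps naturally to LUT-based XNOR arrays and popcount trees instead of DSP multipliers, which is the main source of the energy savings such binary designs report.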
Pages: 5
Related Papers (50 total)
  • [31] Energy-Efficient Architecture for FPGA-based Deep Convolutional Neural Networks with Binary Weights
    Duan, Yunzhi
    Li, Shuai
    Zhang, Ruipeng
    Wang, Qi
    Chen, Jienan
    Sobelman, Gerald E.
    2018 IEEE 23RD INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING (DSP), 2018,
  • [32] An Energy-Efficient FPGA-based Matrix Multiplier
    Tan, Yiyu
    Imamura, Toshiyuki
    2017 24TH IEEE INTERNATIONAL CONFERENCE ON ELECTRONICS, CIRCUITS AND SYSTEMS (ICECS), 2017, : 514 - 517
  • [33] An Energy-Efficient Accelerator Based on Hybrid CPU-FPGA Devices for Password Recovery
    Liu, Peng
    Li, Shunbin
    Ding, Qingyuan
    IEEE TRANSACTIONS ON COMPUTERS, 2019, 68 (02) : 170 - 181
  • [34] A Fast and Efficient FPGA-based Level Set Hardware Accelerator for Image Segmentation
    Liu Ye
    Xiao Jianbiao
    Wu Fei
    Chang Liang
    Zhou Jun
    JOURNAL OF ELECTRONICS & INFORMATION TECHNOLOGY, 2021, 43 (06) : 1525 - 1532
  • [35] A Flexible FPGA-Based Accelerator for Efficient Inference of Multi-Precision CNNs
    Liu, Xinyan
    Wu, Xiao
    Shao, Haiku
    Wang, Zhongfeng
    2024 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS 2024, 2024,
  • [36] An FPGA-Based Accelerator Enabling Efficient Support for CNNs with Arbitrary Kernel Sizes
    Wang, Miaoxin
    Wu, Xiao
    Lin, Jun
    Wang, Zhongfeng
    2024 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS 2024, 2024,
  • [37] Efficient FPGA-Based Accelerator of the L-BFGS Algorithm for IoT Applications
    Xiong, Huiyang
    Xiong, Bohang
    Wang, Wenhao
    Tian, Jing
    Zhu, Hao
    Wang, Zhongfeng
    2023 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS, 2023,
  • [38] Amoeba: An Efficient and Flexible FPGA-Based Accelerator for Arbitrary-Kernel CNNs
    Wu, Xiao
    Wang, Miaoxin
    Lin, Jun
    Wang, Zhongfeng
    IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2024, 32 (06) : 1086 - 1099
  • [39] On Exploiting Patterns For Robust FPGA-based Multi-accelerator Edge Computing Systems
    Razavi, Seyyed Ahmad
    Ting, Hsin-Yu
    Giyahchi, Thotiya
    Bozorgzadeh, Eli
    PROCEEDINGS OF THE 2022 DESIGN, AUTOMATION & TEST IN EUROPE CONFERENCE & EXHIBITION (DATE 2022), 2022, : 116 - 119
  • [40] An FPGA-based binary neural network accelerator with enhanced hardware efficiency and data reuse
    Zhang, Dezheng
    Cen, Rui
    Pu, Han
    Wan, Rui
    Wang, Dong
    MICROELECTRONICS JOURNAL, 2025, 156