Efficient FPGA-Based Transformer Accelerator Using In-Block Balanced Pruning

Citations: 0
Authors
Wang, Saiqun [1 ]
Zhang, Hao [1 ]
Affiliations
[1] Ocean Univ China, Informat Sci & Engn, Qingdao, Peoples R China
Keywords
Transformer Accelerator; Network Pruning; FPGA; Energy-Efficient Computing; Sparse Storage Pattern;
DOI
10.1109/ICCCAS62034.2024.10651591
Chinese Library Classification
TM [Electrical technology]; TN [Electronic technology, communication technology];
Discipline classification codes
0808 ; 0809 ;
Abstract
Transformer models have recently been widely deployed for natural language processing and image processing. However, their superior performance comes with a large number of parameters and a high computational load, which makes transformer models difficult to deploy on resource-limited devices. To reduce the computational cost of transformer models, this paper proposes an improved network pruning method. In the proposed method, the parameter matrix is decomposed into blocks of a fixed size, and pruning is then applied within each block so that every block retains the same number of parameters. To further reduce the memory required for the parameters, an efficient storage pattern for the sparse parameters is also proposed. Finally, by combining these methods, an energy-efficient transformer accelerator architecture is developed. The accelerator is implemented on FPGA devices, and implementation results show that the proposed design significantly improves speed and energy efficiency compared with previous designs.
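The abstract's core idea can be sketched in a few lines: partition the weight matrix into fixed-size blocks, keep only the largest-magnitude entries in each block, and then store the survivors at a fixed stride, since every block holds exactly the same number of nonzeros. The following is a minimal illustrative sketch, not the paper's implementation; the function names, the magnitude-based selection criterion, and the block/keep sizes are assumptions for demonstration.

```python
import numpy as np

def in_block_balanced_prune(weights, block_size=4, keep=2):
    """Zero out all but the `keep` largest-magnitude entries in every
    (block_size x block_size) block, so sparsity is balanced per block."""
    rows, cols = weights.shape
    assert rows % block_size == 0 and cols % block_size == 0
    pruned = np.zeros_like(weights)
    for r in range(0, rows, block_size):
        for c in range(0, cols, block_size):
            block = weights[r:r + block_size, c:c + block_size]
            flat = np.abs(block).ravel()
            # indices of the `keep` largest-magnitude entries in this block
            top = np.argpartition(flat, -keep)[-keep:]
            mask = np.zeros(flat.size, dtype=bool)
            mask[top] = True
            pruned[r:r + block_size, c:c + block_size] = \
                block * mask.reshape(block.shape)
    return pruned

def pack_balanced_sparse(pruned, block_size=4):
    """Pack nonzeros block by block. Because every block holds the same
    number of nonzeros, values and in-block indices sit at a fixed
    stride, so no per-block length metadata is needed."""
    vals, idxs = [], []
    rows, cols = pruned.shape
    for r in range(0, rows, block_size):
        for c in range(0, cols, block_size):
            block = pruned[r:r + block_size, c:c + block_size].ravel()
            nz = np.flatnonzero(block)
            vals.append(block[nz])
            idxs.append(nz)
    return np.concatenate(vals), np.concatenate(idxs).astype(np.uint8)
```

The fixed per-block nonzero count is what makes the storage pattern hardware-friendly: each processing element can fetch its block's values and indices with a constant offset, avoiding the irregular pointer chasing of general sparse formats such as CSR.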
Pages: 18 - 23
Page count: 6