FPGA-Based High-Performance Data Compression Deep Neural Network Accelerator

Cited by: 3
Authors
Wang, Hanze [1 ]
Fu, Yingxun [1 ]
Ma, Li [1 ]
Institutions
[1] North China Univ Technol, Coll Informat Sci, Beijing, Peoples R China
Keywords
deep neural networks; compression; transmission; FPGA
DOI
10.1109/BDICN55575.2022.00109
Chinese Library Classification (CLC)
TP [Automation Technology; Computer Technology]
Discipline Code
0812
Abstract
Deep neural networks play an important role in extracting valuable information from massive amounts of data, but they incur large computational and memory overheads that hinder their use in resource-limited environments such as mobile or embedded devices. To address this problem, researchers typically reduce the volume of data and the number of memory accesses in order to cut the cost of data transmission. In this paper, we design a compressed storage and calculation fusion (CSCF) algorithm for massive input data that compresses the input data volume and improves the processing efficiency of terminal equipment. We first scan and compress the collected data, then classify and store the compressed data according to the locations of consecutive zero-valued pixel blocks. To suit practical development scenarios, we adopt an FPGA hardware architecture, which offers high flexibility, low energy consumption, and a short development cycle, as the terminal processor. On this architecture we design classification-specific calculation units that match the classified compressed storage, and we improve performance by fusing the first-layer convolution of the convolutional neural network with the compressed storage of the input data. Evaluation results show that, compared with a traditional neural network accelerator using uncompressed transmission, our CSCF-FPGA accelerator achieves a speedup of 3.8-4.8x on the MNIST dataset and 1.8-2.1x on the CIFAR datasets. The small fluctuations in speedup and hardware resource utilization show that CSCF-FPGA not only achieves good performance but also incurs no additional hardware overhead.
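The following minimal NumPy sketch, written purely for illustration, mimics the two ideas described in the abstract: run-length compressing consecutive zero-valued pixel runs in the input, and computing the first convolution layer directly from the compressed stream so that zero pixels are neither stored nor multiplied. The paper's actual CSCF design is an FPGA hardware architecture with classification-specific calculation units; the function names (rle_compress, conv_from_runs) and the row-wise run encoding below are assumptions made for this sketch, not the authors' implementation.

# Illustrative sketch only: a software model of the zero-skipping idea behind CSCF.
# All names here are hypothetical and do not come from the paper.
import numpy as np

def rle_compress(img):
    """Store each row of a single-channel image as (row, col_start, values) runs,
    keeping only runs of nonzero pixels and skipping zero-valued runs entirely."""
    runs = []
    for r, row in enumerate(img):
        c = 0
        while c < len(row):
            if row[c] == 0:
                c += 1
                continue
            start = c
            while c < len(row) and row[c] != 0:
                c += 1
            runs.append((r, start, row[start:c].copy()))
    return runs

def conv_from_runs(runs, shape, kernels):
    """Valid (stride-1, no-padding) convolution computed by scattering the
    contribution of every stored nonzero pixel; zero pixels cost nothing."""
    H, W = shape
    n_k, kh, kw = kernels.shape
    out = np.zeros((n_k, H - kh + 1, W - kw + 1))
    for r, c0, vals in runs:
        for dc, v in enumerate(vals):
            c = c0 + dc
            # scatter to every output position whose receptive field covers (r, c)
            for ky in range(kh):
                oy = r - ky
                if not (0 <= oy < out.shape[1]):
                    continue
                for kx in range(kw):
                    ox = c - kx
                    if 0 <= ox < out.shape[2]:
                        out[:, oy, ox] += v * kernels[:, ky, kx]
    return out

if __name__ == "__main__":
    img = np.zeros((28, 28))
    img[10:14, 10:14] = 1.0                 # mostly-zero, MNIST-like input
    kernels = np.random.rand(4, 3, 3)
    fused = conv_from_runs(rle_compress(img), img.shape, kernels)
    # cross-check against a dense reference convolution
    ref = np.zeros_like(fused)
    for ky in range(3):
        for kx in range(3):
            ref += kernels[:, ky, kx, None, None] * img[ky:ky+26, kx:kx+26]
    assert np.allclose(fused, ref)

In the sketch, the compression step and the first-layer convolution share the same run-encoded data, so zero-valued pixel blocks are never transmitted to the compute loop; this is only a behavioral analogy to the fusion the paper performs in FPGA hardware.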
Pages: 563-569
Number of pages: 7