A Dynamically Reconfigurable Accelerator Design Using a Sparse-Winograd Decomposition Algorithm for CNNs

Cited by: 2

Authors:

Zhao, Yunping [1]

Lu, Jianzhuang [1]

Chen, Xiaowen [1]

Affiliations:

[1] Natl Univ Def Technol, Changsha, Peoples R China

Source:

CMC-COMPUTERS MATERIALS & CONTINUA | 2021 / Vol. 66 / Issue 01

Keywords:

High performance computing; accelerator architecture; hardware; NEURAL-NETWORK; CONVOLUTION;

DOI:

10.32604/cmc.2020.012380

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812 ;

Abstract:

Convolutional Neural Networks (CNNs) are widely used in many fields. Because CNN workloads demand both high throughput and a large amount of computation, an increasing number of researchers are focusing on how to improve the computational efficiency, hardware utilization, or flexibility of CNN hardware accelerators. Accordingly, this paper proposes a dynamically reconfigurable accelerator architecture that implements a high-parallelism hardware design based on Sparse-Winograd F(2 x 2, 3 x 3). This approach not only eliminates the pre-calculation complexity associated with the Winograd algorithm, thereby reducing the difficulty of hardware implementation, but also greatly improves the flexibility of the hardware; as a result, the accelerator can compute Conventional Convolution, Grouped Convolution (GCONV), or Depthwise Separable Convolution (DSC) on the same hardware architecture. Experimental results on VGG-16 and MobileNet V1 show that the accelerator achieves a 3x-4.14x speedup over designs that do not use the acceleration algorithm. Moreover, compared with previous designs using the traditional Winograd algorithm, the accelerator achieves a 1.4x-1.8x speedup. At the same time, the efficiency of the multiplier improves by up to 142%.
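For readers unfamiliar with the F(2 x 2, 3 x 3) notation in the abstract, the following is a minimal Python sketch of one Winograd tile: a 4 x 4 input tile and a 3 x 3 kernel yield a 2 x 2 output using 16 element-wise multiplications instead of the 36 needed by direct convolution. The transform matrices are the standard ones for this tile size; the function names (`winograd_f2x2_3x3`, `direct_conv`) are hypothetical, and the count of non-zero transformed weights is only a software stand-in for the sparsity idea, not the accelerator's actual hardware scheme, which this record does not describe.

```python
# Standard transform matrices for Winograd F(2x2, 3x3).
BT = [[1, 0, -1, 0],
      [0, 1, 1, 0],
      [0, -1, 1, 0],
      [0, 1, 0, -1]]
G = [[1.0, 0.0, 0.0],
     [0.5, 0.5, 0.5],
     [0.5, -0.5, 0.5],
     [0.0, 0.0, 1.0]]
AT = [[1, 1, 1, 0],
      [0, 1, -1, -1]]


def matmul(a, b):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def winograd_f2x2_3x3(tile, kernel):
    """Return (2x2 output, number of non-zero transformed weights).

    tile is 4x4, kernel is 3x3. The second return value illustrates how
    many of the 16 element-wise products a sparsity-aware design would
    actually need to perform (assumption: zeros are simply skipped).
    """
    u = matmul(matmul(G, kernel), transpose(G))   # transformed kernel, 4x4
    v = matmul(matmul(BT, tile), transpose(BT))   # transformed input, 4x4
    m = [[u[i][j] * v[i][j] for j in range(4)] for i in range(4)]
    nonzero = sum(1 for row in u for w in row if w != 0)
    y = matmul(matmul(AT, m), transpose(AT))      # inverse transform, 2x2
    return y, nonzero


def direct_conv(tile, kernel):
    """Reference: direct 3x3 sliding-window convolution (36 multiplies)."""
    return [[sum(tile[i + r][j + c] * kernel[r][c]
                 for r in range(3) for c in range(3))
             for j in range(2)] for i in range(2)]


tile = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
y, nonzero = winograd_f2x2_3x3(tile, kernel)
```

With the example kernel above, the Winograd result matches `direct_conv(tile, kernel)`, and several entries of the transformed kernel are zero, which is the kind of redundancy a sparse design can exploit.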

Citation

Pages: 517-535

Page count: 19

4 records in total

[1] An Accelerator Design Using a MTCA Decomposition Algorithm for CNNs
Zhao, Yunping
Lu, Jianzhuang
Chen, Xiaowen
SENSORS, 2020, 20 (19) : 1 - 15
[2] SWM: A High-Performance Sparse-Winograd Matrix Multiplication CNN Accelerator
Wu, Di
Fan, Xitian
Cao, Wei
Wang, Lingli
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, 2021, 29 (05) : 936 - 949
[3] OctCNN: A High Throughput FPGA Accelerator for CNNs Using Octave Convolution Algorithm
Lou, Wenqi
Gong, Lei
Wang, Chao
Du, Zidong
Zhou, Xuehai
IEEE TRANSACTIONS ON COMPUTERS, 2021, 71 (08) : 1847 - 1859
[4] Rethinking the Designing of Convolution Engine for Reconfigurable CNN Accelerator Using Sparse-Based Design Scheme
Meng, Yishuo
Wang, Jianfei
Xiang, Siwei
Hou, Jia
Lin, Zhijie
Mei, Kuizhi
Yang, Chen
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2025,
