Optimizing the Deep Neural Networks by Layer-Wise Refined Pruning and the Acceleration on FPGA

Cited by: 16
Authors
Li, Hengyi [1 ]
Yue, Xuebin [1 ]
Wang, Zhichen [1 ]
Chai, Zhilei [2 ]
Wang, Wenwen [3 ]
Tomiyama, Hiroyuki [1 ]
Meng, Lin [1 ]
Affiliations
[1] Ritsumeikan Univ, Dept Elect & Comp Engn, Kusatsu, Shiga, Japan
[2] Jiangnan Univ, Sch AI & Comp Sci, Wuxi, Peoples R China
[3] Univ Georgia, Dept Comp Sci, Athens, GA USA
Keywords
MODEL;
DOI
10.1155/2022/8039281
Chinese Library Classification
Q [Biological Sciences];
Subject Classification Code
07; 0710; 09;
Abstract
To accelerate the practical applications of artificial intelligence, this paper proposes a highly efficient layer-wise refined pruning method for deep neural networks at the software level and accelerates the inference process at the hardware level on a field-programmable gate array (FPGA). The refined pruning operation is based on the channel-wise importance indexes of each layer and the layer-wise input sparsity of the convolutional layers. The method exploits the characteristics of the native networks without introducing any extra workload into the training phase. In addition, the operation is easily extended to various state-of-the-art deep neural networks. The effectiveness of the method is verified on ResNet and VGG architectures with the CIFAR10, CIFAR100, and ImageNet100 datasets. Experimental results show that for ResNet50 on CIFAR10 and ResNet101 on CIFAR100, more than 85% of the parameters and Floating-Point Operations are pruned with only 0.35% and 0.40% accuracy loss, respectively. As for the VGG network, 87.05% of the parameters and 75.78% of the Floating-Point Operations are pruned with only 0.74% accuracy loss for VGG13BN on CIFAR10. Furthermore, we accelerate the networks at the hardware level on the FPGA platform by utilizing the tool Vitis AI. In the two-thread mode on the FPGA, the throughput of the pruned VGG13BN and ResNet101 reaches 151.99 fps and 124.31 fps, respectively, and the pruned networks achieve about 4.3x and 1.8x speedup for VGG13BN and ResNet101, respectively, compared with the original networks on the FPGA.
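The abstract does not spell out how the channel-wise importance indexes are computed, but channel pruning of this kind is commonly illustrated with an L1-norm filter score: rank each output channel of a convolutional layer by the magnitude of its weights and drop the lowest-scoring fraction. The sketch below is only an illustrative assumption of that generic scheme, not the paper's exact method; the function names and the L1 criterion are choices made here for clarity.

```python
import numpy as np

def channel_importance(weights):
    """Per-output-channel importance as the L1 norm of each filter.

    weights: array of shape (out_channels, in_channels, kH, kW).
    Returns a 1-D array of out_channels scores.
    """
    return np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)

def select_channels_to_keep(weights, prune_ratio):
    """Indices of channels to keep after dropping the lowest-importance
    fraction given by prune_ratio (e.g. 0.5 removes half the filters)."""
    scores = channel_importance(weights)
    n_keep = max(1, int(round(weights.shape[0] * (1.0 - prune_ratio))))
    # Keep the n_keep highest-scoring channels, preserving original order.
    return np.sort(np.argsort(scores)[-n_keep:])

# Toy example: a layer with 8 filters of shape 3x3x3, pruned by 50%.
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 3, 3, 3))
kept = select_channels_to_keep(w, prune_ratio=0.5)
pruned_w = w[kept]  # the pruned layer retains only the selected filters
```

In a full pipeline, pruning a layer's output channels also shrinks the input-channel dimension of the next layer's weights, which is where the parameter and FLOP savings reported in the abstract come from.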
Pages: 22
Related Papers
13 items in total
  • [1] Layer-Wise Training Convolutional Neural Networks With Smaller Filters for Human Activity Recognition Using Wearable Sensors
    Tang, Yin
    Teng, Qi
    Zhang, Lei
    Min, Fuhong
    He, Jun
    IEEE SENSORS JOURNAL, 2021, 21 (01) : 581 - 592
  • [2] Exact solutions for free vibration analysis of laminated, box and sandwich beams by refined layer-wise theory
    Yang, Yan
    Pagani, Alfonso
    Carrera, Erasmo
    COMPOSITE STRUCTURES, 2017, 175 : 28 - 45
  • [3] Integrated Deep Learning-based Online Layer-wise Surface Prediction of Additive Manufacturing
    Yangue, Emmanuel
    Ye, Zehao
    Kan, Chen
    Liu, Chenang
    MANUFACTURING LETTERS, 2023, 35 : 760 - 769
  • [4] Effects of various information scenarios on layer-wise relevance propagation-based interpretable convolutional neural networks for air handling unit fault diagnosis
    Xiong, Chenglong
    Li, Guannan
    Yan, Ying
    Zhang, Hanyuan
    Xu, Chengliang
    Chen, Liang
    BUILDING SIMULATION, 2024, : 1709 - 1730
  • [5] A High Throughput Acceleration for Hybrid Neural Networks With Efficient Resource Management on FPGA
    Yin, Shouyi
    Tang, Shibin
    Lin, Xinhan
    Ouyang, Peng
    Tu, Fengbin
    Liu, Leibo
    Wei, Shaojun
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2019, 38 (04) : 678 - 691
  • [6] Smart Pruning of Deep Neural Networks Using Curve Fitting and Evolution of Weights
    Islam, Ashhadul
    Belhaouari, Samir Brahim
    MACHINE LEARNING, OPTIMIZATION, AND DATA SCIENCE, LOD 2022, PT II, 2023, 13811 : 62 - 76
  • [7] Developmental Plasticity-Inspired Adaptive Pruning for Deep Spiking and Artificial Neural Networks
    Han, Bing
    Zhao, Feifei
    Zeng, Yi
    Shen, Guobin
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, 47 (01) : 240 - 251
  • [8] Pruning deep convolutional neural networks for efficient edge computing in condition assessment of infrastructures
    Wu, Rih-Teng
    Singla, Ankush
    Jahanshahi, Mohammad R.
    Bertino, Elisa
    Ko, Bong Jun
    Verma, Dinesh
    COMPUTER-AIDED CIVIL AND INFRASTRUCTURE ENGINEERING, 2019, 34 (09) : 774 - 789
  • [9] Impact of On-chip Interconnect on In-memory Acceleration of Deep Neural Networks
    Krishnan, Gokul
    Mandal, Sumit K.
    Chakrabarti, Chaitali
    Seo, Jae-Sun
    Ogras, Umit Y.
    Cao, Yu
    ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, 2022, 18 (02)
  • [10] Short-Term Load Forecasting Based on Deep Neural Networks Using LSTM Layer
    Kwon, Bo-Sung
    Park, Rae-Jun
    Song, Kyung-Bin
    JOURNAL OF ELECTRICAL ENGINEERING & TECHNOLOGY, 2020, 15 (04) : 1501 - 1509