Optimizing the Deep Neural Networks by Layer-Wise Refined Pruning and the Acceleration on FPGA

Cited by: 16
Authors
Li, Hengyi [1 ]
Yue, Xuebin [1 ]
Wang, Zhichen [1 ]
Chai, Zhilei [2 ]
Wang, Wenwen [3 ]
Tomiyama, Hiroyuki [1 ]
Meng, Lin [1 ]
Affiliations
[1] Ritsumeikan Univ, Dept Elect & Comp Engn, Kusatsu, Shiga, Japan
[2] Jiangnan Univ, Sch AI & Comp Sci, Wuxi, Peoples R China
[3] Univ Georgia, Dept Comp Sci, Athens, GA USA
Keywords
MODEL;
DOI
10.1155/2022/8039281
Chinese Library Classification
Q [Biological Sciences];
Subject Classification Code
07; 0710; 09;
Abstract
To accelerate the practical applications of artificial intelligence, this paper proposes a highly efficient layer-wise refined pruning method for deep neural networks at the software level and accelerates the inference process at the hardware level on a field-programmable gate array (FPGA). The refined pruning operation is based on the channel-wise importance indexes of each layer and the layer-wise input sparsity of the convolutional layers. The method exploits the characteristics of the native networks without introducing any extra workload into the training phase. In addition, the operation is easily extended to various state-of-the-art deep neural networks. The effectiveness of the method is verified on ResNet and VGG architectures with the CIFAR10, CIFAR100, and ImageNet100 datasets. Experimental results show that for ResNet50 on CIFAR10 and ResNet101 on CIFAR100, more than 85% of the parameters and Floating-Point Operations are pruned with only 0.35% and 0.40% accuracy loss, respectively. As for the VGG network, 87.05% of the parameters and 75.78% of the Floating-Point Operations are pruned with only 0.74% accuracy loss for VGG13BN on CIFAR10. Furthermore, we accelerate the networks at the hardware level on the FPGA platform by utilizing the tool Vitis AI. In the two-thread mode on the FPGA, the throughput of the pruned VGG13BN and ResNet101 reaches 151.99 fps and 124.31 fps, respectively, and the pruned networks achieve about 4.3x and 1.8x speedup for VGG13BN and ResNet101, respectively, compared with the original networks on the FPGA.
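The abstract does not spell out how the channel-wise importance indexes are computed, but channel pruning of this kind is commonly illustrated with an L1-norm filter score: rank each output channel of a convolutional layer by the magnitude of its weights and drop the lowest-scoring fraction. The sketch below is only an illustrative assumption of that generic scheme, not the paper's exact method; the function names and the L1 criterion are choices made here for clarity.

```python
import numpy as np

def channel_importance(weights):
    """Per-output-channel importance as the L1 norm of each filter.

    weights: array of shape (out_channels, in_channels, kH, kW).
    Returns a 1-D array of out_channels scores.
    """
    return np.abs(weights).reshape(weights.shape[0], -1).sum(axis=1)

def select_channels_to_keep(weights, prune_ratio):
    """Indices of channels to keep after dropping the lowest-importance
    fraction given by prune_ratio (e.g. 0.5 removes half the filters)."""
    scores = channel_importance(weights)
    n_keep = max(1, int(round(weights.shape[0] * (1.0 - prune_ratio))))
    # Keep the n_keep highest-scoring channels, preserving original order.
    return np.sort(np.argsort(scores)[-n_keep:])

# Toy example: a layer with 8 filters of shape 3x3x3, pruned by 50%.
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 3, 3, 3))
kept = select_channels_to_keep(w, prune_ratio=0.5)
pruned_w = w[kept]  # the pruned layer retains only the selected filters
```

In a full pipeline, pruning a layer's output channels also shrinks the input-channel dimension of the next layer's weights, which is where the parameter and FLOP savings reported in the abstract come from.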
Pages: 22
Related Papers
13 items in total
  • [1] Layer-Wise Training Convolutional Neural Networks With Smaller Filters for Human Activity Recognition Using Wearable Sensors
    Tang, Yin
    Teng, Qi
    Zhang, Lei
    Min, Fuhong
    He, Jun
    IEEE SENSORS JOURNAL, 2021, 21 (01) : 581 - 592
  • [2] Exact solutions for free vibration analysis of laminated, box and sandwich beams by refined layer-wise theory
    Yang, Yan
    Pagani, Alfonso
    Carrera, Erasmo
    COMPOSITE STRUCTURES, 2017, 175 : 28 - 45
  • [3] Integrated Deep Learning-based Online Layer-wise Surface Prediction of Additive Manufacturing
    Yangue, Emmanuel
    Ye, Zehao
    Kan, Chen
    Liu, Chenang
    MANUFACTURING LETTERS, 2023, 35 : 760 - 769
  • [4] Effects of various information scenarios on layer-wise relevance propagation-based interpretable convolutional neural networks for air handling unit fault diagnosis
    Xiong, Chenglong
    Li, Guannan
    Yan, Ying
    Zhang, Hanyuan
    Xu, Chengliang
    Chen, Liang
    BUILDING SIMULATION, 2024, : 1709 - 1730
  • [5] A High Throughput Acceleration for Hybrid Neural Networks With Efficient Resource Management on FPGA
    Yin, Shouyi
    Tang, Shibin
    Lin, Xinhan
    Ouyang, Peng
    Tu, Fengbin
    Liu, Leibo
    Wei, Shaojun
    IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, 2019, 38 (04) : 678 - 691
  • [6] Smart Pruning of Deep Neural Networks Using Curve Fitting and Evolution of Weights
    Islam, Ashhadul
    Belhaouari, Samir Brahim
    MACHINE LEARNING, OPTIMIZATION, AND DATA SCIENCE, LOD 2022, PT II, 2023, 13811 : 62 - 76
  • [7] Developmental Plasticity-Inspired Adaptive Pruning for Deep Spiking and Artificial Neural Networks
    Han, Bing
    Zhao, Feifei
    Zeng, Yi
    Shen, Guobin
    IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2025, 47 (01) : 240 - 251
  • [8] Pruning deep convolutional neural networks for efficient edge computing in condition assessment of infrastructures
    Wu, Rih-Teng
    Singla, Ankush
    Jahanshahi, Mohammad R.
    Bertino, Elisa
    Ko, Bong Jun
    Verma, Dinesh
    COMPUTER-AIDED CIVIL AND INFRASTRUCTURE ENGINEERING, 2019, 34 (09) : 774 - 789
  • [9] Impact of On-chip Interconnect on In-memory Acceleration of Deep Neural Networks
    Krishnan, Gokul
    Mandal, Sumit K.
    Chakrabarti, Chaitali
    Seo, Jae-Sun
    Ogras, Umit Y.
    Cao, Yu
    ACM JOURNAL ON EMERGING TECHNOLOGIES IN COMPUTING SYSTEMS, 2022, 18 (02)
  • [10] Short-Term Load Forecasting Based on Deep Neural Networks Using LSTM Layer
    Kwon, Bo-Sung
    Park, Rae-Jun
    Song, Kyung-Bin
    JOURNAL OF ELECTRICAL ENGINEERING & TECHNOLOGY, 2020, 15 (04) : 1501 - 1509