Partition Pruning: Parallelization-Aware Pruning for Dense Neural Networks

Cited by: 0
Authors
Shahhosseini, Sina [1 ]
Albaqsami, Ahmad [1 ]
Jasemi, Masoomeh [1 ,2 ]
Bagherzadeh, Nader [1 ]
Affiliations
[1] Univ Calif Irvine, Elect Engn & Comp Sci Dept, Irvine, CA 92697 USA
[2] Sharif Univ Technol, Dept Comp Engn, Tehran, Iran
Source
2020 28TH EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND NETWORK-BASED PROCESSING (PDP 2020) | 2020
Keywords
Parallelization; Deep Neural Network; Pruning; Partitioning; Hardware Accelerator
DOI
10.1109/PDP50117.2020.00053
CLC Number
TP3 [Computing technology, computer technology]
Discipline Code
0812
Abstract
As neural networks are improved to be more accurate, their model sizes grow exponentially. Consequently, a huge number of parameters must be loaded from and stored to the memory hierarchy and processed in order to perform the training or inference phase of neural network processing. This growth poses a major challenge for real-time deployment, since memory bandwidth is not improving as quickly as model complexity is increasing. Although some operations in neural network processing, such as convolutional layers, are compute-intensive, dense layers are bound by memory bandwidth. To address this issue, this paper proposes Partition Pruning for dense layers, which reduces the number of required parameters while taking parallelization into account. We evaluated the performance and energy consumption of parallel inference on partitioned models and observed a 7.72x speedup and a 2.73x reduction in energy for computing the pruned fully connected layers of the TinyVGG16 model, compared to running the unpruned model on a single accelerator. Furthermore, our method incurred only a limited reduction in accuracy when partitioning fully connected layers.
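The abstract only sketches the idea at a high level. As an illustration, the minimal NumPy sketch below shows one plausible reading of partition pruning for a dense layer: the weight matrix is split into one partition per accelerator and pruned within each partition so the parallel workload stays balanced. The column-wise split, the per-partition magnitude threshold, and all function names are assumptions made for this example, not the authors' implementation.

```python
import numpy as np

def partition_prune_dense(weight, num_partitions, sparsity):
    """Hypothetical sketch of parallelization-aware pruning of a dense layer.

    weight:          2-D array of shape (in_features, out_features)
    num_partitions:  number of parallel accelerators / partitions
    sparsity:        fraction of weights to remove inside each partition
    """
    # Split the output neurons across accelerators (assumed column-wise split).
    partitions = np.array_split(weight, num_partitions, axis=1)
    pruned = []
    for part in partitions:
        # Prune the same fraction in every partition so each accelerator
        # keeps a comparable number of parameters (balanced workload).
        k = int(sparsity * part.size)
        if k > 0:
            threshold = np.partition(np.abs(part), k - 1, axis=None)[k - 1]
            part = np.where(np.abs(part) <= threshold, 0.0, part)
        pruned.append(part)
    return pruned  # each entry can be dispatched to its own accelerator

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.standard_normal((256, 512))  # toy dense-layer weights
    parts = partition_prune_dense(w, num_partitions=4, sparsity=0.8)
    for i, p in enumerate(parts):
        print(f"partition {i}: shape {p.shape}, nonzero weights {np.count_nonzero(p)}")
```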
Pages: 307-311
Number of Pages: 5