Partition Pruning: Parallelization-Aware Pruning for Dense Neural Networks

Cited by: 0
Authors
Shahhosseini, Sina [1 ]
Albaqsami, Ahmad [1 ]
Jasemi, Masoomeh [1 ,2 ]
Bagherzadeh, Nader [1 ]
Affiliations
[1] Univ Calif Irvine, Elect Engn & Comp Sci Dept, Irvine, CA 92697 USA
[2] Sharif Univ Technol, Dept Comp Engn, Tehran, Iran
Source
2020 28TH EUROMICRO INTERNATIONAL CONFERENCE ON PARALLEL, DISTRIBUTED AND NETWORK-BASED PROCESSING (PDP 2020) | 2020
Keywords
Parallelization; Deep Neural Network; Pruning; Partitioning; Hardware Accelerator
DOI
10.1109/PDP50117.2020.00053
Chinese Library Classification (CLC)
TP3 [Computing Technology; Computer Technology]
Subject Classification Code
0812
Abstract
As recent neural networks are improved to be more accurate, their model sizes grow exponentially. Consequently, an enormous number of parameters must be loaded from and stored in the memory hierarchy and processed in compute units during the training or inference phase. This growth poses a major challenge for real-time deployment, because memory bandwidth is not improving as fast as model complexity is increasing. Although some operations in neural network processing are compute-intensive, such as convolutional layers, computing dense layers is bottlenecked by memory bandwidth. To address this issue, this paper proposes Partition Pruning for dense layers, which reduces the number of required parameters while taking parallelization into account. We evaluated the performance and energy consumption of parallel inference on partitioned models, observing a 7.72x speedup and a 2.73x reduction in the energy used to compute the pruned fully connected layers of the TinyVGG16 model, compared to running the unpruned model on a single accelerator. In addition, our method incurs only a limited reduction in accuracy when partitioning fully connected layers.
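The abstract only outlines the approach, so the snippet below is a rough, hypothetical sketch rather than the paper's exact algorithm: it splits a fully connected layer's weight matrix column-wise into partitions that could each be mapped to a separate accelerator, then magnitude-prunes each partition independently. The function name, the column-wise split, and the 50% keep ratio are illustrative assumptions introduced here, not details taken from the paper.

```python
import numpy as np

def partition_and_prune(W, num_partitions=4, keep_ratio=0.5):
    """Illustrative sketch of partition-then-prune for a dense layer.

    W is the layer's weight matrix; each partition is pruned on its own so
    every accelerator ends up with the same fraction of nonzero weights.
    """
    # Split the weight matrix column-wise, one slice per accelerator.
    partitions = np.array_split(W, num_partitions, axis=1)
    pruned = []
    for part in partitions:
        flat = np.abs(part).ravel()
        k = max(1, int(flat.size * keep_ratio))   # number of weights to keep
        threshold = np.sort(flat)[-k]              # magnitude of the k-th largest weight
        mask = np.abs(part) >= threshold           # keep only the largest-magnitude weights
        pruned.append(part * mask)
    return pruned

# Toy example: a 512x1024 fully connected layer split across 4 partitions.
W = np.random.randn(512, 1024).astype(np.float32)
parts = partition_and_prune(W)
density = sum(int((p != 0).sum()) for p in parts) / W.size
print([p.shape for p in parts], f"remaining density ~ {density:.2f}")
```

Pruning each partition to the same keep ratio keeps the per-accelerator workload balanced, which is one plausible reading of "parallelization-aware" pruning; the paper's actual partitioning and pruning criteria may differ.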
Pages: 307-311
Page count: 5
Related Papers
50 records in total
  • [21] Cyclical Pruning for Sparse Neural Networks
    Srinivas, Suraj
    Kuzmin, Andrey
    Nagel, Markus
    van Baalen, Mart
    Skliar, Andrii
    Blankevoort, Tijmen
    2022 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION WORKSHOPS, CVPRW 2022, 2022, : 2761 - 2770
  • [22] Methods for Pruning Deep Neural Networks
    Vadera, Sunil
    Ameen, Salem
    IEEE ACCESS, 2022, 10 : 63280 - 63300
  • [23] Growing and Pruning Neural Tree Networks
    Sankar, A.
    Mammone, R. J.
    IEEE TRANSACTIONS ON COMPUTERS, 1993, 42 (03) : 291 - 299
  • [24] Pruning versus Clipping in Neural Networks
    Janowsky, S. A.
    PHYSICAL REVIEW A, 1989, 39 (12) : 6600 - 6603
  • [25] Multi-objective pruning of dense neural networks using deep reinforcement learning
    Hirsch, Lior
    Katz, Gilad
    INFORMATION SCIENCES, 2022, 610 : 381 - 400
  • [26] Hessian-Aware Pruning and Optimal Neural Implant
    Yu, Shixing
    Yao, Zhewei
    Gholami, Amir
    Dong, Zhen
    Kim, Sehoon
    Mahoney, Michael W.
    Keutzer, Kurt
    2022 IEEE WINTER CONFERENCE ON APPLICATIONS OF COMPUTER VISION (WACV 2022), 2022, : 3665 - 3676
  • [27] Pruning-aware Sparse Regularization for Network Pruning
    Jiang, Nan-Fei
    Zhao, Xu
    Zhao, Chao-Yang
    An, Yong-Qi
    Tang, Ming
    Wang, Jin-Qiao
    MACHINE INTELLIGENCE RESEARCH, 2023, 20 (01) : 109 - 120
  • [28] Inference-aware convolutional neural network pruning
    Choudhary, Tejalal
    Mishra, Vipul
    Goswami, Anurag
    Sarangapani, Jagannathan
    FUTURE GENERATION COMPUTER SYSTEMS-THE INTERNATIONAL JOURNAL OF ESCIENCE, 2022, 135 : 44 - 56