Accelerating Deep Neural Networks via Semi-Structured Activation Sparsity

Cited by: 1
Authors
Grimaldi, Matteo [1 ]
Ganji, Darshan C. [1 ]
Lazarevich, Ivan [1 ]
Sah, Sudhakar [1 ]
Affiliations
[1] Deeplite, Toronto, ON, Canada
Source
2023 IEEE/CVF INTERNATIONAL CONFERENCE ON COMPUTER VISION WORKSHOPS, ICCVW | 2023
DOI
10.1109/ICCVW60793.2023.00127
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
The demand for efficient processing of deep neural networks (DNNs) on embedded devices is a significant challenge limiting their deployment. Exploiting sparsity in the network's feature maps is one way to reduce inference latency. It is known that unstructured sparsity causes less accuracy degradation than structured sparsity, but the former requires extensive inference-engine changes to yield latency benefits. To tackle this challenge, we propose a solution to induce semi-structured activation sparsity that is exploitable through minor runtime modifications. To attain high speedup levels at inference time, we design a sparse training procedure that is aware of the final position of the activations during the General Matrix Multiplication (GEMM) computation. We extensively evaluate the proposed solution across various models for image classification and object detection tasks. Remarkably, our approach yields a speed improvement of 1.25x with a minimal accuracy drop of 1.1% for the ResNet18 model on the ImageNet dataset. Furthermore, when combined with a state-of-the-art structured pruning method, the resulting models provide a good latency-accuracy trade-off, outperforming models that solely employ structured pruning techniques. The code is available at https://github.com/Deeplite/activ-sparse.
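
The abstract does not detail the masking scheme, so the following is a minimal PyTorch sketch of one common way to induce a semi-structured activation sparsity pattern during training: within each block of m consecutive activations, keep only the n largest-magnitude values. The names (semi_structured_mask, SparseActivation) and the 2:4 pattern are illustrative assumptions, not the authors' actual implementation from the activ-sparse repository.

import torch

def semi_structured_mask(x: torch.Tensor, n: int = 2, m: int = 4) -> torch.Tensor:
    # Hypothetical block-wise n:m mask (assumes x.numel() is divisible by m).
    flat = x.abs().reshape(-1, m)        # group flattened activations into blocks of m
    idx = flat.topk(n, dim=1).indices    # indices of the n largest-magnitude values per block
    mask = torch.zeros_like(flat)
    mask.scatter_(1, idx, 1.0)           # mark the surviving positions with 1
    return mask.reshape_as(x)

class SparseActivation(torch.nn.Module):
    # Zeroes all but the top-n of every m consecutive activations; masked
    # positions also receive zero gradient, so training adapts the network
    # to the fixed sparsity pattern.
    def __init__(self, n: int = 2, m: int = 4):
        super().__init__()
        self.n, self.m = n, m

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * semi_structured_mask(x, self.n, self.m)

# Usage: placed after a ReLU so the GEMM consuming these activations sees a
# predictable 2-in-4 zero pattern that a runtime kernel could skip cheaply.
act = SparseActivation(n=2, m=4)
y = act(torch.relu(torch.randn(1, 64, 8, 8)))

The position-aware training the paper describes would additionally align this pattern with where activations land in the GEMM input; that step is specific to the authors' method and is not reproduced here.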
Pages: 1171 - 1180
Page count: 10