Hybrid-Grained Pruning and Hardware Acceleration for Convolutional Neural Networks

Cited by: 0
Authors
Li, Yu [1 ]
Cao, Shan [1 ]
Zhao, Beining [1 ]
Zhang, Wei [1 ]
Jiang, Zhiyuan [1 ]
Affiliations
[1] Shanghai Univ, Sch Commun & Informat Engn, Shanghai 200444, Peoples R China
Source
2024 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS 2024 | 2024
Funding
National Natural Science Foundation of China;
Keywords
Pruning; Hybrid-Grained; CNN; Accelerator;
DOI
10.1109/ISCAS58744.2024.10558640
Chinese Library Classification (CLC)
TP39 [Computer Applications];
Discipline Classification Code
081203; 0835;
Abstract
Across various convolutional neural network (CNN) models, sparsity increases as the network deepens, which offers significant potential for model compression and hardware acceleration. In this paper, a dual-factor hybrid-grained pruning method is introduced to strike a balance between model compression and accuracy preservation. The proposed method combines hardware-friendly unstructured vector-level pruning with structured filter-level pruning to exploit multiple grains of sparsity in CNNs. A corresponding hardware accelerator architecture is then proposed based on a row-based convolution dataflow, which fully utilizes the hybrid sparsity to accelerate CNN processing. Experimental results demonstrate that, on VGG16, the proposed method increases the compression rate by 1.08x with only 0.21% accuracy loss compared to a state-of-the-art filter pruning method, at the cost of a 2.39% increase in hardware resources compared to an accelerator without sparsity optimization.
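This record does not reproduce the paper's algorithm, but the two pruning grains named in the abstract can be illustrated compactly. The NumPy sketch below assumes L1-magnitude scoring and independent sparsity ratios at each grain; the function name, parameters, and criterion are illustrative assumptions, not the paper's dual-factor formulation.

# Minimal sketch of two-grain magnitude pruning (assumed L1 criterion;
# not the paper's dual-factor method).
import numpy as np

def hybrid_grained_prune(w, filter_ratio=0.3, vector_ratio=0.3):
    # w: convolution weights of shape (out_ch, in_ch, kh, kw)
    w = w.copy()
    out_ch, in_ch, kh, kw = w.shape

    # Grain 1 -- structured filter-level pruning: zero out whole output
    # filters with the smallest L1 norms.
    filter_scores = np.abs(w).reshape(out_ch, -1).sum(axis=1)
    dropped = np.argsort(filter_scores)[:int(out_ch * filter_ratio)]
    w[dropped] = 0.0

    # Grain 2 -- vector-level pruning inside the surviving filters: treat
    # each kh*kw kernel slice (one per filter/input-channel pair) as a
    # vector and zero the lowest-L1 vectors, keeping a regular vector
    # granularity that hardware can index cheaply.
    survivors = np.setdiff1d(np.arange(out_ch), dropped)
    vec_scores = np.abs(w[survivors]).reshape(len(survivors), in_ch, -1).sum(axis=2)
    k = int(vec_scores.size * vector_ratio)
    if k > 0:
        threshold = np.partition(vec_scores.ravel(), k - 1)[k - 1]
        keep = vec_scores > threshold  # ties at the threshold are pruned
        # Fancy indexing returns a copy, so write the result back explicitly:
        w[survivors] = w[survivors] * keep[:, :, None, None]
    return w

# Usage: prune a random 3x3 conv layer and report overall sparsity.
w = np.random.randn(64, 32, 3, 3).astype(np.float32)
pruned = hybrid_grained_prune(w, filter_ratio=0.25, vector_ratio=0.5)
print(f"overall sparsity: {np.mean(pruned == 0.0):.2%}")

In this sketch, the filter-level grain removes entire output channels that structured hardware can skip outright, while the vector-level grain zeroes whole kh x kw kernel slices, the kind of regular sparsity pattern that a row-based convolution dataflow can exploit.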
Pages: 5