Quantized Guided Pruning for Efficient Hardware Implementations of Deep Neural Networks

Cited: 0
Authors
Hacene, Ghouthi Boukli [1 ,2 ]
Gripon, Vincent [2 ]
Arzel, Matthieu [2 ]
Farrugia, Nicolas [2 ]
Bengio, Yoshua [1 ]
Affiliations
[1] Univ Montreal, MILA, Montreal, PQ, Canada
[2] IMT Atlantique, Lab-STICC, Nantes, France
Source
2020 18th IEEE International New Circuits and Systems Conference (NEWCAS'20), 2020
DOI
10.1109/newcas49341.2020.9159769
CLC Classification
TM [Electrical Engineering]; TN [Electronics and Communication Technology];
Discipline Codes
0808; 0809
Abstract
Deep Neural Networks (DNNs) in general, and Convolutional Neural Networks (CNNs) in particular, are state-of-the-art in numerous computer vision tasks such as object classification and detection. However, the large number of parameters they contain leads to high computational complexity and strongly limits their usability in budget-constrained devices such as embedded devices. In this paper, we propose a combination of a pruning technique and a quantization scheme that effectively reduces the complexity and memory usage of the convolutional layers of CNNs by replacing the costly convolution operation with a low-cost multiplexer. We perform experiments on the CIFAR10, CIFAR100 and SVHN datasets and show that the proposed method achieves almost state-of-the-art accuracy while drastically reducing the computational and memory footprints compared to the baselines. We also propose an efficient hardware architecture, implemented on Field-Programmable Gate Arrays (FPGAs), to accelerate inference; it works as a pipeline and accommodates multiple layers operating at the same time to speed up the inference process. In contrast with most proposed approaches, which rely on external memory or software-defined memory controllers, our work is based on algorithmic optimization and a full-hardware design, enabling a direct, on-chip memory implementation of a DNN while keeping close to state-of-the-art accuracy.
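The paper's contribution is hardware-specific, but the core software-side idea the abstract describes — magnitude pruning combined with quantizing the surviving weights to a small codebook, so that a multiply can be replaced by a multiplexer selecting among a few precomputed values — can be sketched in NumPy. This is an illustrative reconstruction under assumed parameters (`keep_ratio`, `n_levels`), not the authors' implementation:

```python
import numpy as np

def prune_and_quantize(w, keep_ratio=0.25, n_levels=4):
    """Magnitude-prune a weight tensor, then map the surviving weights
    to a small shared codebook (illustrative sketch only)."""
    k = max(1, int(keep_ratio * w.size))
    # Keep the k largest-magnitude weights, zero out the rest.
    thresh = np.sort(np.abs(w), axis=None)[-k]
    mask = np.abs(w) >= thresh
    pruned = np.where(mask, w, 0.0)
    # Map survivors to n_levels uniformly spaced values. With so few
    # distinct weight values, a hardware multiplier can be replaced by
    # a multiplexer selecting among precomputed partial products.
    wmax = np.abs(pruned).max()
    levels = np.linspace(-wmax, wmax, n_levels)
    nearest = np.argmin(np.abs(pruned[..., None] - levels), axis=-1)
    return np.where(mask, levels[nearest], 0.0), mask

rng = np.random.default_rng(0)
w = rng.normal(size=(8, 8))
wq, mask = prune_and_quantize(w)
print(mask.mean())          # fraction of weights kept -> 0.25
print(np.unique(wq).size)   # at most n_levels + 1 distinct values (incl. 0)
```

With only `n_levels` distinct nonzero weight values per layer, each "multiplication" in the hardware design reduces to selecting one of a handful of precomputed products, which is what enables the multiplexer-based convolutional layer the abstract refers to.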
Pages: 206-209 (4 pages)
Related Papers (50 total)
  • [31] Sparse optimization guided pruning for neural networks. Shi, Y.; Tang, A.; Niu, L.; Zhou, R. Neurocomputing, 2024, 574.
  • [32] Hardware implementations of neural networks and the Random Neural Network Chip (RNNC). Aybay, I.; Çerkez, C.; Halici, U.; Badaroglu, M. Advances in Computer and Information Sciences '98, 1998, 53: 157-161.
  • [33] Sparsity in deep learning: Pruning and growth for efficient inference and training in neural networks. Hoefler, T.; Alistarh, D.; Ben-Nun, T.; Dryden, N.; Peste, A. Journal of Machine Learning Research, 2021, 22.
  • [34] Pruning Deep Neural Networks for Green Energy-Efficient Models: A Survey. Tmamna, J.; Ben Ayed, E.; Fourati, R.; Gogate, M.; Arslan, T.; Hussain, A.; Ayed, M. B. Cognitive Computation, 2024, 16(6): 2931-2952.
  • [36] Resource-Aware Saliency-Guided Differentiable Pruning for Deep Neural Networks. Kallakuri, U.; Humes, E.; Mohsenin, T. Proceedings of the Great Lakes Symposium on VLSI 2024 (GLSVLSI 2024), 2024: 694-699.
  • [37] Compression of Deep Neural Networks based on quantized tensor decomposition to implement on reconfigurable hardware platforms. Nekooei, A.; Safari, S. Neural Networks, 2022, 150: 350-363.
  • [38] An Efficient and Fast Softmax Hardware Architecture (EFSHA) for Deep Neural Networks. Hussain, M. A.; Tsai, T.-H. 2021 IEEE 3rd International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2021.
  • [39] Efficient Hardware Optimization Strategies for Deep Neural Networks Acceleration Chip. Zhang, M.; Zhang, J.; Li, G.; Wu, R.; Zeng, X. Journal of Electronics & Information Technology, 2021, 43(6): 1510-1517.
  • [40] Efficient Hardware Design of Convolutional Neural Networks for Accelerated Deep Learning. Khalil, K.; Khan, M. R.; Bayoumi, M.; Sherif, A. 2024 IEEE 67th International Midwest Symposium on Circuits and Systems (MWSCAS 2024), 2024: 1075-1079.