Quantized Guided Pruning for Efficient Hardware Implementations of Deep Neural Networks

Cited by: 0

Authors:

Hacene, Ghouthi Boukli [1,2]

Gripon, Vincent [2]

Arzel, Matthieu [2]

Farrugia, Nicolas [2]

Bengio, Yoshua [1]

Affiliations:

[1] IMT Atlantique, Lab STICC, Nantes, France

[2] Univ Montreal, MILA, Montreal, PQ, Canada

Source:

2020 18th IEEE International New Circuits and Systems Conference (NEWCAS'20) | 2020

Keywords:

DOI:

10.1109/newcas49341.2020.9159769

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronics and communication technology]

Discipline classification codes:

0808; 0809

Abstract:

Deep Neural Networks (DNNs) in general, and Convolutional Neural Networks (CNNs) in particular, are state of the art in numerous computer vision tasks such as object classification and detection. However, the large number of parameters they contain leads to high computational complexity and strongly limits their usability on budget-constrained platforms such as embedded devices. In this paper, we propose a combination of a pruning technique and a quantization scheme that effectively reduces the complexity and memory usage of the convolutional layers of CNNs by replacing the complex convolution operation with a low-cost multiplexer. We perform experiments on the CIFAR10, CIFAR100 and SVHN datasets and show that the proposed method achieves almost state-of-the-art accuracy while drastically reducing the computational and memory footprints compared to the baselines. We also propose an efficient hardware architecture, implemented on Field Programmable Gate Arrays (FPGAs), to accelerate inference; it works as a pipeline and accommodates multiple layers operating at the same time. In contrast with most previously proposed approaches, which rely on external memory or software-defined memory controllers, our work is based on algorithmic optimization and a full-hardware design, enabling a direct, on-chip memory implementation of a DNN while keeping accuracy close to the state of the art.
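
One way to read the multiplexer claim in the abstract is sketched below. It is an illustration under stated assumptions, not the authors' implementation: it assumes that pruning leaves a single surviving weight in each k x k kernel and that quantization restricts that weight to +1 or -1. Under those assumptions, the sum of products over a window collapses to selecting one pixel and optionally negating it. All function names here are hypothetical.

```python
# Illustrative sketch only. Assumptions (not confirmed by the abstract):
#   - each k x k kernel is pruned down to ONE surviving weight
#   - that weight is quantized to +1 or -1

def one_hot_kernel(k, position, sign):
    """Build a k x k kernel that is zero except for `sign` at `position`."""
    kernel = [[0] * k for _ in range(k)]
    kernel[position[0]][position[1]] = sign
    return kernel

def conv2d_valid(image, kernel):
    """Reference 2D cross-correlation (no padding, stride 1)."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [
            sum(kernel[i][j] * image[r + i][c + j]
                for i in range(k) for j in range(k))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def mux_conv2d_valid(image, k, position, sign):
    """Multiplexer-style equivalent: pick one pixel per window, apply the sign."""
    i, j = position
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [
        [sign * image[r + i][c + j] for c in range(out_w)]
        for r in range(out_h)
    ]

# 4 x 4 test image holding the values 0..15 in row-major order.
image = [[r * 4 + c for c in range(4)] for r in range(4)]

# Surviving weight at row 0, column 2 of a 3 x 3 kernel, quantized to -1.
dense = conv2d_valid(image, one_hot_kernel(3, (0, 2), -1))
muxed = mux_conv2d_valid(image, 3, (0, 2), -1)

print(dense == muxed)  # the two computations agree
```

The dense version performs k*k multiply-accumulate operations per output value, while the selection version performs none, which is the kind of saving the abstract attributes to the multiplexer.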


Pages: 206-209

Number of pages: 4
