Toward Efficient Co-Design of CNN Quantization and HW Architecture on FPGA Hybrid-Accelerator

Cited by: 0

Authors:

Zhang, Yiran [1]

Li, Guiying [1]

Yuan, Bo [1]

Affiliations:

[1] Southern Univ Sci & Technol, Guangdong Prov Key Lab Brain Inspired Intelligent, Shenzhen, Peoples R China

Source:

2024 INTERNATIONAL SYMPOSIUM OF ELECTRONICS DESIGN AUTOMATION, ISEDA 2024 | 2024

Funding:

National Natural Science Foundation of China;

Keywords:

CNN accelerator; FPGA; DSE method;

DOI:

10.1109/ISEDA62518.2024.10617620

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Field-programmable gate arrays (FPGAs) have emerged as a promising platform for accelerating convolutional neural networks (CNNs). In this paper, we propose a low-latency CNN hybrid-accelerator system and an efficient design space exploration (DSE) method. Specifically, our targeted FPGA platform consists of different types of accelerators for two advantages: high concurrency and full utilization of hardware resources (i.e., lookup tables (LUTs) and digital signal processors (DSPs)). In addition, we adopt a bandwidth-aware analytical model for system latency that considers pipeline stalls and computation cycles simultaneously. Furthermore, for the huge design space encompassing layer-wise CNN quantization and FPGA hybrid-accelerator architecture, we propose a DSE method (named DiMEGA) aimed at enhancing search efficiency, which is a differentiable method embedded by a genetic algorithm. The performance of our CNN hybrid-accelerator system is demonstrated on a PYNQ-Z2 FPGA platform. The experimental results show that the system latency can be reduced by 42% to 48% without sacrificing accuracy, and that the DSE time of DiMEGA is reduced by 23% on ResNet20-CIFAR10 and by 63% on ResNet56-CIFAR10, compared with the state of the art (SOTA).


Pages: 678-683

Page count: 6
