Optimizing OpenCL-Based CNN Design on FPGA with Comprehensive Design Space Exploration and Collaborative Performance Modeling

Cited by: 9
Authors
Mu, Jiandong [1]
Zhang, Wei [1]
Liang, Hao [2]
Sinha, Sharad [3]
Affiliations
[1] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
[2] Alibaba Grp, Hangzhou, Peoples R China
[3] Indian Inst Technol IIT, Veling, Goa, India
Keywords
CNN; modeling; hardware design; design space exploration
DOI
10.1145/3397514
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
Recent success in applying convolutional neural networks (CNNs) to object detection and classification has sparked great interest in accelerating CNNs using hardware such as field-programmable gate arrays (FPGAs). However, finding an efficient FPGA design for a given CNN model and FPGA board is not trivial, since it requires a strong background in hardware design and detailed knowledge of the target board. In this work, we address this problem through design space exploration with a collaborative framework. Our framework consists of three main parts: FPGA design generation, coarse-grained modeling, and fine-grained modeling. In FPGA design generation, we propose a novel data structure, LoopTree, to capture the details of the FPGA design for CNN applications without writing the source code. Different LoopTrees, which represent different FPGA designs, are automatically generated in this process. A coarse-grained model then evaluates LoopTrees at the operation level (e.g., add, mult, and so on) so that the most efficient LoopTrees can be selected. A fine-grained model, which is based on the source code, then refines the selected design in a cycle-accurate manner. A comprehensive set of OpenCL-based designs has been implemented on board to verify our framework. Average estimation errors of 8.87% and 4.8% have been observed for our coarse-grained model and fine-grained model, respectively. These errors are much lower than those of the prevalent operation-statistics-based estimation, which is obtained according to a predefined formula for specific loop schedules.
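The abstract does not give the internals of LoopTree or of either model, so the following is only an illustrative sketch of the general idea: represent a loop nest as a tree, count operations per node, and rank candidate designs by a rough cycle estimate. Every name here (`LoopNode`, `count_ops`, `estimate_cycles`, `build_design`, `relative_error`), the unroll knob, and the one-cycle-per-iteration cost rule are assumptions made for illustration, not the paper's actual data structure or formulas.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class LoopNode:
    """Hypothetical loop-nest node: trip count, unroll factor, and body."""
    name: str
    trip_count: int
    unroll: int = 1                                     # assumed parallelism knob
    ops: Dict[str, int] = field(default_factory=dict)   # ops per iteration
    children: List["LoopNode"] = field(default_factory=list)


def count_ops(node: LoopNode) -> Dict[str, int]:
    """Total operations executed by this loop and all loops nested in it."""
    totals = dict(node.ops)
    for child in node.children:
        for op, n in count_ops(child).items():
            totals[op] = totals.get(op, 0) + n
    return {op: n * node.trip_count for op, n in totals.items()}


def estimate_cycles(node: LoopNode) -> int:
    """Toy estimate: one cycle per op-bearing iteration, divided by unroll."""
    body = 1 if node.ops else 0
    body += sum(estimate_cycles(c) for c in node.children)
    iterations = -(-node.trip_count // node.unroll)     # ceiling division
    return iterations * max(body, 1)


def build_design(unroll_k: int) -> LoopNode:
    """A tiny convolution-like nest: 4 output channels, 3 input channels, 3x3 kernel."""
    k = LoopNode("kernel", 9, unroll=unroll_k, ops={"mult": 1, "add": 1})
    c_in = LoopNode("c_in", 3, children=[k])
    return LoopNode("c_out", 4, children=[c_in])


def relative_error(estimated: float, measured: float) -> float:
    """Estimation error relative to a measured value."""
    return abs(estimated - measured) / measured


if __name__ == "__main__":
    # Rank two candidate designs by the toy cycle estimate.
    candidates = {u: build_design(u) for u in (1, 3)}
    for u, design in candidates.items():
        print(u, count_ops(design), estimate_cycles(design))
    best = min(candidates, key=lambda u: estimate_cycles(candidates[u]))
    print("selected unroll factor:", best)
```

In this sketch the operation counts are identical across candidates and only the estimated cycle count changes with the unroll factor, which is the kind of comparison a coarse, operation-level model could use to prune designs before a more detailed model refines the survivors.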
Pages: 28