FlexCNN: An End-to-end Framework for Composing CNN Accelerators on FPGA

Cited by: 19
Authors
Basalama, Suhail [1]
Sohrabizadeh, Atefeh [1]
Wang, Jie [1]
Guo, Licheng [1]
Cong, Jason [1]
Affiliations
[1] Univ Calif Los Angeles, 404 Westwood Blvd Engn,6 Room 468, Los Angeles, CA 90095 USA
Keywords
FPGA; CNN; ONNX; systolic array; transposed convolution; dilated convolution; OpenPose; U-Net; E-Net;
DOI
10.1145/3570928
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
With reduced data reuse and parallelism, recent convolutional neural networks (CNNs) create new challenges for FPGA acceleration. Systolic arrays (SAs) are efficient, scalable architectures for convolutional layers, but without proper optimizations their efficiency drops dramatically for three reasons: (1) the different dimensions within same-type layers, (2) the different types of convolution layers, especially transposed and dilated convolutions, and (3) the complex dataflow graphs of CNNs. Furthermore, significant overheads arise when integrating FPGAs into machine learning frameworks. Therefore, we present a flexible, composable architecture called FlexCNN, which delivers high computation efficiency by employing dynamic tiling, layer fusion, and data layout optimizations. Additionally, we implement a novel versatile SA to process normal, transposed, and dilated convolutions efficiently. FlexCNN also uses a fully pipelined software-hardware integration that alleviates the software overheads. Moreover, with an automated compilation flow, FlexCNN takes a CNN in the ONNX representation, performs a design space exploration, and generates an FPGA accelerator. The framework is tested using three complex CNNs: OpenPose, U-Net, and E-Net. The architecture optimizations achieve a 2.3x performance improvement. Compared to a standard SA, the versatile SA achieves close-to-ideal speedups, with up to 15.98x and 13.42x for transposed and dilated convolutions, respectively, with a 6% average area overhead. The pipelined integration leads to a 5x speedup for OpenPose.
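The abstract does not say how a standard SA handles transposed and dilated convolutions. One common baseline, assumed here purely for illustration, is to pad with zeros: a transposed convolution becomes a normal convolution over a zero-inserted input, and a dilated convolution becomes a normal convolution with a zero-inserted kernel. The Python sketch below estimates how much of that padded work is useful under this assumption, ignoring border effects. The function names are hypothetical and are not taken from FlexCNN.

```python
def zero_inserted_size(n: int, stride: int) -> int:
    """Side length after inserting (stride - 1) zeros between n elements."""
    return (n - 1) * stride + 1


def transposed_useful_fraction(n: int, stride: int) -> float:
    """Fraction of nonzero elements in an n x n input after zero insertion.

    Assumption: the baseline runs a normal convolution over the
    zero-inserted input, so this approximates the share of useful MACs.
    """
    up = zero_inserted_size(n, stride)
    return (n * n) / (up * up)


def dilated_useful_fraction(k: int, dilation: int) -> float:
    """Fraction of nonzero weights in a k x k kernel after zero insertion.

    Assumption: the baseline expands the kernel to its effective size
    and multiplies by the inserted zeros as well.
    """
    eff = zero_inserted_size(k, dilation)
    return (k * k) / (eff * eff)


if __name__ == "__main__":
    # Example sizes chosen for illustration only.
    for stride in (2, 4):
        frac = transposed_useful_fraction(64, stride)
        print(f"transposed, stride {stride}: useful {frac:.3f}, "
              f"upper-bound speedup {1 / frac:.2f}x")
    for dilation in (2, 4):
        frac = dilated_useful_fraction(3, dilation)
        print(f"dilated 3x3, dilation {dilation}: useful {frac:.3f}, "
              f"upper-bound speedup {1 / frac:.2f}x")
```

Under this assumed baseline, the reciprocal of the useful fraction is an upper bound on the speedup available from skipping the zeros, which is one way to read the "close-to-ideal" claim. The specific figures reported in the abstract depend on the layer configurations the authors evaluated, which are not given here.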
Pages: 32