COAC: Cross-Layer Optimization of Accelerator Configurability for Efficient CNN Processing

Cited by: 3
Authors
Colleman, Steven [1]
Shi, Man [1]
Verhelst, Marian [1]
Affiliations
[1] Katholieke Univ Leuven, ESAT MICAS, B-3000 Leuven, Belgium
Funding
European Research Council
Keywords
Arrays; Hardware; Cross layer design; Neural networks; Convolutional neural networks; Topology; Optimization; Convolutional neural network (CNN); cross-layer; data flow for reconfigurability; modeling of data reformatting;
DOI
10.1109/TVLSI.2023.3268084
Chinese Library Classification
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
To achieve high accuracy, convolutional neural networks (CNNs) are growing in complexity and in the diversity of their layer types and topologies. This makes it very challenging to deploy such networks efficiently on custom processor architectures for resource-scarce edge devices. Existing mapping exploration frameworks search for the optimal execution schedule or hardware mapping of each individual network layer by optimizing that layer's spatial unrolling (SU, dataflow parallelization) and temporal unrolling (TU, execution order). However, these tools fail to take into account the overhead of supporting different unrolling schemes within a common hardware architecture. Using a fixed unrolling scheme across all layers is not ideal either, as it misses significant opportunities for energy and latency savings from optimizing the mapping of diverse layer types. A balanced approach assesses the right amount of mapping flexibility needed across the target neural networks, while taking into account the overhead of supporting multiple unrollings. This article therefore presents cross-layer optimization of accelerator configurability (COAC), a cross-layer design space exploration and mapping framework that optimizes the flexibility of neural processing architectures by balancing configurability overhead against the resulting energy and latency savings for end-to-end inference. COAC not only provides a systematic analysis of the architectural overhead as a function of the supported SUs, but also builds an automated flow to find the best unrolling combination(s) for efficient end-to-end inference with limited hardware overhead. Results demonstrate that architectures with carefully optimized flexibility can achieve up to 38% energy-delay-product (EDP) savings for a set of six neural networks at the expense of a relative area increase of 9.5%.
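The trade-off described in the abstract can be illustrated with a minimal sketch. It is not the COAC flow itself: the SU names, layer names, cost numbers, and area overheads below are hypothetical, and end-to-end cost is simplified to a sum of per-layer costs standing in for EDP. The sketch enumerates every set of supported SUs, lets each layer use its cheapest supported SU, and keeps the set with the lowest total cost that fits an area budget.

```python
from itertools import combinations

# Hypothetical per-layer cost (arbitrary units, a stand-in for EDP) under each
# candidate spatial unrolling (SU). Illustrative values, not from the paper.
LAYER_COST = {
    "conv3x3":   {"SU_A": 10.0, "SU_B": 14.0, "SU_C": 30.0},
    "depthwise": {"SU_A": 40.0, "SU_B": 12.0, "SU_C": 15.0},
    "pointwise": {"SU_A": 20.0, "SU_B": 22.0, "SU_C": 8.0},
}

# Hypothetical relative area overhead (%) of supporting each SU in hardware.
SU_AREA_OVERHEAD = {"SU_A": 0.0, "SU_B": 4.0, "SU_C": 6.0}


def total_cost(supported):
    """Sum of per-layer costs when each layer picks its cheapest supported SU."""
    return sum(min(costs[su] for su in supported) for costs in LAYER_COST.values())


def best_su_set(area_budget):
    """Return (cost, area, su_tuple) for the cheapest SU set within the budget.

    Returns None if no non-empty SU set fits the budget.
    """
    best = None
    sus = sorted(SU_AREA_OVERHEAD)
    for k in range(1, len(sus) + 1):
        for subset in combinations(sus, k):
            area = sum(SU_AREA_OVERHEAD[su] for su in subset)
            if area > area_budget:
                continue  # this SU set costs too much area
            candidate = (total_cost(subset), area, subset)
            if best is None or candidate < best:
                best = candidate
    return best


if __name__ == "__main__":
    for budget in (0.0, 5.0, 10.0):
        print(budget, best_su_set(budget))
```

Under these made-up numbers, widening the area budget lets the search add SUs that suit the depthwise and pointwise layers, which lowers the total cost. Exhaustive enumeration is only practical for a handful of candidate SUs; a real exploration would need a cost model per layer and mapping rather than a fixed table.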
Pages: 945-958
Number of pages: 14