Design Space Exploration for Layer-parallel Execution of Convolutional Neural Networks on CGRAs

Cited by: 8
Authors
Heidorn, Christian [1]
Hannig, Frank [1]
Teich, Jurgen [1]
Affiliations
[1] Friedrich Alexander Univ Erlangen Nurnberg FAU, Dept Comp Sci, Hardware Software Codesign, Erlangen, Germany
Source
PROCEEDINGS OF THE 23RD INTERNATIONAL WORKSHOP ON SOFTWARE AND COMPILERS FOR EMBEDDED SYSTEMS (SCOPES 2020) | 2020
Keywords
CNN Accelerators; Design Space Exploration; CGRA
DOI
10.1145/3378678.3391878
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
In this work, we systematically explore the design space of throughput, energy, and hardware costs for layer-parallel mappings of Convolutional Neural Networks (CNNs) onto coarse-grained reconfigurable arrays (CGRAs). We derive an analytical model that computes the resources (processing elements) and buffer memory, and thus the hardware cost C, required to sustain a given throughput T, as well as the resulting overall energy consumption E for inference. Further, we propose an efficient design space exploration (DSE) to determine the fronts of Pareto-optimal (T,E,C) solutions. This exploration helps to determine the limits of scalability of the presented tiled CGRA accelerator architectures in terms of throughput, the number of layers that can be processed in parallel, and memory requirements. Finally, we evaluate the energy savings achievable on our architecture in comparison to implementations that execute a CNN sequentially, layer by layer. Experiments show that, for MobileNet, layer-parallel processing reduces energy consumption E by 3.6x and hardware cost C by 1.2x, and increases the achievable throughput T by 6.2x.
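The notion of a Pareto-optimal (T,E,C) front mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' DSE method or analytical model; it is a generic brute-force dominance filter, assuming throughput T is maximized while energy E and cost C are minimized. The names `dominates`, `pareto_front`, and the candidate values are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# A design point as (T, E, C): throughput, energy, hardware cost.
# Assumption: T is maximized, E and C are minimized.
Design = Tuple[float, float, float]


def dominates(a: Design, b: Design) -> bool:
    """Return True if a is no worse than b in every objective
    and strictly better in at least one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[0] > b[0] or a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(designs: List[Design]) -> List[Design]:
    """Keep only the designs that no other design dominates (O(n^2) filter)."""
    return [d for d in designs if not any(dominates(o, d) for o in designs)]


# Hypothetical candidate points, not data from the paper.
candidates = [
    (10.0, 5.0, 3.0),
    (8.0, 6.0, 4.0),   # dominated by (10.0, 5.0, 3.0)
    (12.0, 7.0, 3.5),
    (9.0, 4.0, 5.0),
]
front = pareto_front(candidates)
```

In this toy set, only the second candidate is removed, because another point offers higher throughput at lower energy and lower cost; the remaining three each win in at least one objective.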
Pages: 26-31
Number of pages: 6