Processor Architecture Optimization for Spatially Dynamic Neural Networks

Cited by: 2
Authors
Colleman, Steven [1]
Verelst, Thomas [1]
Mei, Linyan [1]
Tuytelaars, Tinne [1]
Verhelst, Marian [1]
Affiliations
[1] Katholieke Univ Leuven, ESAT, Dept Elect Engn, Leuven, Belgium
Source
PROCEEDINGS OF THE 2021 IFIP/IEEE INTERNATIONAL CONFERENCE ON VERY LARGE SCALE INTEGRATION (VLSI-SOC) | 2021
Keywords
Spatial dynamic execution; processor design; analytic modelling; scheduling optimizer
DOI
10.1109/VLSI-SoC53125.2021.9607013
Chinese Library Classification number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Spatially dynamic neural networks adapt their execution to the input data, saving computation by skipping unimportant image regions. Yet for most neural network architectures, GPU implementations fail to turn these spatially dynamic execution patterns into speedups. This paper investigates the hardware constraints that prevent such speedups, and proposes and compares novel processor architectures and dataflows that translate dynamic execution into latency improvements with minimal loss of utilization. The presented architectures flexibly support spatial execution of a broad range of networks. For the derived architectures, the spatial unrolling for each layer type is optimized and validated, using the ZigZag design space exploration framework where appropriate. This makes it possible to benchmark and compare the hardware architectures on NNs for classification and human pose estimation, increasing throughput by up to 1.9× and 2.3×, respectively, compared to their static executions. These gains are of the same order of magnitude as those of other dynamic execution methods, while being complementary to them.
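The idea of skipping unimportant image regions can be illustrated with a minimal sketch. It is not the paper's analytic model: the function names, the 4x4 mask, and the channel counts are hypothetical, and the result is only an idealized upper bound that assumes perfect hardware utilization, which is exactly what the abstract says is hard to reach in practice.

```python
def conv_macs(height, width, c_in, c_out, kernel=3):
    """Multiply-accumulates of a dense stride-1 convolution ('same' padding)."""
    return height * width * c_in * c_out * kernel * kernel


def masked_conv_macs(mask, c_in, c_out, kernel=3):
    """MACs when only output positions whose mask value is 1 are computed."""
    active = sum(sum(row) for row in mask)
    return active * c_in * c_out * kernel * kernel


# Hypothetical 4x4 feature map: 6 of the 16 positions are marked as important.
mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

dense = conv_macs(4, 4, c_in=64, c_out=64)
sparse = masked_conv_macs(mask, c_in=64, c_out=64)

# Idealized speedup: assumes skipped positions cost nothing and the
# processing elements stay fully utilized (an assumption, not a measurement).
ideal_speedup = dense / sparse
```

In this toy case the MAC count drops in proportion to the fraction of active positions. A real processor typically falls short of this bound, since irregular masks make it difficult to keep every compute unit busy.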
Pages: 24-29
Number of pages: 6
Related papers
19 in total
[1]   SnaPEA: Predictive Early Activation for Reducing Computation in Deep Convolutional Neural Networks [J].
Akhlaghi, Vahideh ;
Yazdanbakhsh, Amir ;
Samadi, Kambiz ;
Gupta, Rajesh K. ;
Esmaeilzadeh, Hadi .
2018 ACM/IEEE 45TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), 2018, :662-673
[2]   2D Human Pose Estimation: New Benchmark and State of the Art Analysis [J].
Andriluka, Mykhaylo ;
Pishchulin, Leonid ;
Gehler, Peter ;
Schiele, Bernt .
2014 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2014, :3686-3693
[3]  
Bengio E., 2016, ARXIV151106297 CS
[4]  
Deng J, 2009, PROC CVPR IEEE, P248, DOI 10.1109/CVPRW.2009.5206848
[5]   Spatially Adaptive Computation Time for Residual Networks [J].
Figurnov, Michael ;
Collins, Maxwell D. ;
Zhu, Yukun ;
Zhang, Li ;
Huang, Jonathan ;
Vetrov, Dmitry ;
Salakhutdinov, Ruslan .
30TH IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2017), 2017, :1790-1799
[6]   A Cost-Effective CNN Accelerator Design with Configurable PU on FPGA [J].
Fong, Chi Fung Brian ;
Mu, Jiandong ;
Zhang, Wei .
2019 IEEE COMPUTER SOCIETY ANNUAL SYMPOSIUM ON VLSI (ISVLSI 2019), 2019, :31-36
[7]   Deep Residual Learning for Image Recognition [J].
He, Kaiming ;
Zhang, Xiangyu ;
Ren, Shaoqing ;
Sun, Jian .
2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2016, :770-778
[8]   Boosting the Performance of CNN Accelerators with Dynamic Fine-Grained Channel Gating [J].
Hua, Weizhe ;
Zhou, Yuan ;
De Sa, Christopher ;
Zhang, Zhiru ;
Suh, G. Edward .
MICRO'52: THE 52ND ANNUAL IEEE/ACM INTERNATIONAL SYMPOSIUM ON MICROARCHITECTURE, 2019, :139-150
[9]  
Lin J, 2017, ADV NEUR IN, V30
[10]   ZigZag: Enlarging Joint Architecture-Mapping Design Space Exploration for DNN Accelerators [J].
Mei, Linyan ;
Houshmand, Pouya ;
Jain, Vikram ;
Giraldo, Sebastian ;
Verhelst, Marian .
IEEE TRANSACTIONS ON COMPUTERS, 2021, 70 (08) :1160-1174