CBinfer: Exploiting Frame-to-Frame Locality for Faster Convolutional Network Inference on Video Streams

Cited by: 18
Authors
Cavigelli, Lukas [1]
Benini, Luca [1]
Affiliation
[1] Swiss Fed Inst Technol, CH-8092 Zurich, Switzerland
Funding
Swiss National Science Foundation; EU Horizon 2020
Keywords
Feature extraction; Object detection; Throughput; Convolution; Inference algorithms; Semantics; Approximation algorithms; Convolutional neural networks; machine learning algorithms; embedded software; video surveillance; image segmentation; NEURAL-NETWORKS;
DOI
10.1109/TCSVT.2019.2903421
Chinese Library Classification (CLC)
TM [Electrical engineering]; TN [Electronics and communication technology];
Discipline codes
0808; 0809;
Abstract
The last few years have brought advances in computer vision at an amazing pace, grounded in new findings in deep neural network construction and training as well as the availability of large labeled datasets. Applying these networks to images demands a high computational effort and pushes the use of state-of-the-art networks on real-time video data out of reach of embedded platforms. Many recent works focus on reducing network complexity for real-time inference on embedded computing platforms. We adopt an orthogonal viewpoint and propose a novel algorithm exploiting the spatio-temporal sparsity of pixel changes. For a semantic segmentation application, this optimized inference procedure achieves an average speed-up of 9.1X over cuDNN on the Tegra X2 platform at a negligible accuracy loss of < 0.1% and without retraining the network. Similarly, an average speed-up of 7.0X has been achieved for a pose detection DNN, and a 5X reduction in the number of arithmetic operations for object detection on static-camera video surveillance data. These throughput gains, combined with lower power consumption, result in an energy efficiency of 511 GOp/s/W, compared to 70 GOp/s/W for the baseline.
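The idea summarized in the abstract, namely recomputing convolution outputs only at locations whose inputs changed between consecutive frames, can be illustrated with a minimal sketch. This is a single-channel NumPy toy model for intuition only, not the paper's actual GPU implementation; the function names and the change threshold `tau` are assumptions of this sketch.

```python
import numpy as np

def conv2d_full(x, k):
    """Dense 'valid' 2-D cross-correlation (single channel), as a baseline."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def conv2d_change_based(prev_in, cur_in, prev_out, k, tau=0.0):
    """Update prev_out by recomputing only outputs whose receptive field
    contains an input pixel that changed by more than tau."""
    kh, kw = k.shape
    out = prev_out.copy()
    changed = np.abs(cur_in - prev_in) > tau   # change-detection mask
    affected = set()
    for y, x in zip(*np.nonzero(changed)):
        # A changed input pixel (y, x) influences output positions (i, j)
        # with i <= y <= i + kh - 1 and j <= x <= j + kw - 1.
        for i in range(max(0, y - kh + 1), min(prev_out.shape[0], y + 1)):
            for j in range(max(0, x - kw + 1), min(prev_out.shape[1], x + 1)):
                affected.add((i, j))
    for i, j in affected:
        out[i, j] = np.sum(cur_in[i:i + kh, j:j + kw] * k)
    return out
```

With `tau = 0` the change-based result matches a full recomputation exactly; with `tau > 0` small pixel fluctuations are ignored, trading a bounded accuracy loss for fewer recomputed outputs, which is the source of the speed-ups reported above when frame-to-frame changes are sparse.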
Pages: 1451-1465
Page count: 15