Skip-Convolutions for Efficient Video Processing

Cited by: 28
Authors
Habibian, Amirhossein [1]
Abati, Davide [1]
Cohen, Taco S. [1]
Bejnordi, Babak Ehteshami [1]
Affiliations
[1] Qualcomm AI Res, San Diego, CA 92121 USA
Source
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021
DOI
10.1109/CVPR46437.2021.00272
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We propose Skip-Convolutions to leverage the large amount of redundancy in video streams and save computation. Each video is represented as a series of changes across frames and network activations, denoted as residuals. We reformulate standard convolution to be efficiently computed on residual frames: each layer is coupled with a binary gate deciding whether a residual is important to the model prediction (e.g., foreground regions) or can be safely skipped (e.g., background regions). These gates can either be implemented as an efficient network trained jointly with the convolution kernels, or can simply skip residuals based on their magnitude. Gating functions can also incorporate block-wise sparsity structures, as required for efficient implementation on hardware platforms. By replacing all convolutions with Skip-Convolutions in two state-of-the-art architectures, namely EfficientDet and HRNet, we reduce their computational cost consistently by a factor of 3~4x for two different tasks, without any accuracy drop. Extensive comparisons with existing model compression methods, as well as image and video efficiency methods, demonstrate that Skip-Convolutions set a new state of the art by effectively exploiting the temporal redundancies in videos.
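The residual reformulation in the abstract rests on the linearity of convolution: the output for the current frame equals the previous output plus the convolution of the residual. The following is a minimal NumPy sketch of the magnitude-based gating variant, not the authors' implementation. It assumes a pointwise (1x1) convolution so that each pixel can be gated independently, and the function name `skip_conv1x1`, the threshold `eps`, and the max-absolute-value gating rule are all hypothetical choices made for illustration.

```python
import numpy as np

def skip_conv1x1(x_prev, z_prev, x_curr, weight, eps=1e-3):
    """Residual 1x1 convolution with a magnitude-based binary gate (illustrative).

    x_prev, x_curr: input activations for frames t-1 and t, shape (C_in, H, W)
    z_prev:         output of the layer for frame t-1, shape (C_out, H, W)
    weight:         1x1 convolution kernel, shape (C_out, C_in)
    eps:            hypothetical threshold on the residual magnitude
    """
    # Change in the input between consecutive frames.
    residual = x_curr - x_prev
    # Binary gate per pixel: True where the residual is large enough to matter.
    gate = np.abs(residual).max(axis=0) > eps          # shape (H, W)
    # Start from the previous output and update only the gated pixels.
    z_curr = z_prev.copy()
    z_curr[:, gate] += weight @ residual[:, gate]      # (C_out, N_active)
    return z_curr, gate

# Usage: only a 2x2 patch changes between the two frames.
rng = np.random.default_rng(0)
weight = rng.standard_normal((4, 3))
x_prev = rng.standard_normal((3, 8, 8))
x_curr = x_prev.copy()
x_curr[:, 1:3, 1:3] += 1.0
z_prev = np.einsum("oc,chw->ohw", weight, x_prev)
z_curr, gate = skip_conv1x1(x_prev, z_prev, x_curr, weight, eps=0.0)
```

With `eps=0.0` the gated result matches a dense convolution of the current frame, while the matrix product touches only the pixels that changed. A larger `eps` trades exactness for fewer computed pixels.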
Pages: 2694-2703
Page count: 10