Contraction of Dynamically Masked Deep Neural Networks for Efficient Video Processing

Cited by: 2
Authors
Rueckauer, Bodo [1, 2, 3]
Liu, Shih-Chii [1, 2]
Affiliations
[1] Univ Zurich, Inst Neuroinformat, CH-8057 Zurich, Switzerland
[2] Swiss Fed Inst Technol, CH-8057 Zurich, Switzerland
[3] Radboud Univ Nijmegen, Donders Inst Brain Cognit & Behav, NL-6525 XZ Nijmegen, Netherlands
Keywords
Deep neural networks; network compression; Taylor approximation; masking
DOI
10.1109/TCSVT.2021.3066241
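Chinese Library Classification (CLC) number
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification code
0808; 0809
Abstract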
Sequential data such as video are characterized by spatio-temporal redundancies. So far, few deep learning algorithms exploit these redundancies to decrease the often massive cost of inference. This work leverages correlations in video data to reduce the size and run-time cost of deep neural networks. Drawing upon the simplicity of the commonly used ReLU activation function, we replace this function with dynamically updating masks. The resulting network is a simple chain of matrix multiplications and bias additions, which can be contracted into a single weight matrix and bias vector. Inference then reduces to an affine transformation of the input sample with these contracted parameters. We show that the method is akin to approximating the neural network with a first-order Taylor expansion around a dynamically updating reference point. For triggering these updates, one static and three data-driven mechanisms are analyzed. We evaluate the proposed algorithm on a range of tasks, including pose estimation on surveillance data, road detection on KITTI driving scenes, object detection on ImageNet videos, and denoising of MNIST digits, and obtain compression rates of up to 3.6×.
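
The contraction described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: it assumes a plain fully connected network with ReLU after every layer except the last, the function names `forward_relu` and `contract` are hypothetical, and the mechanisms that trigger mask updates are not modeled. The sketch freezes the ReLU masks at a reference input, folds the masked layers into one weight matrix and bias vector, and then applies that affine map to a nearby input.

```python
import numpy as np

def forward_relu(x, weights, biases):
    """Ordinary forward pass; ReLU after every layer except the last.
    Also returns the binary masks that the ReLUs applied."""
    masks = []
    h = x
    for i, (W, b) in enumerate(zip(weights, biases)):
        h = W @ h + b
        if i < len(weights) - 1:
            m = (h > 0).astype(h.dtype)  # 1 where the unit is active
            masks.append(m)
            h = m * h                    # identical to ReLU(h)
    return h, masks

def contract(weights, biases, masks):
    """Fold the masked layers into a single (W_c, b_c) pair so that
    the masked network computes W_c @ x + b_c."""
    n_in = weights[0].shape[1]
    W_c = np.eye(n_in)
    b_c = np.zeros(n_in)
    for i, (W, b) in enumerate(zip(weights, biases)):
        W_c = W @ W_c
        b_c = W @ b_c + b
        if i < len(masks):
            W_c = masks[i][:, None] * W_c  # zero the rows of inactive units
            b_c = masks[i] * b_c
    return W_c, b_c

# Toy network with random parameters (illustrative sizes only).
rng = np.random.default_rng(0)
sizes = [8, 16, 16, 4]
weights = [rng.standard_normal((o, i)) for i, o in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal(o) for o in sizes[1:]]

# Reference input: run the full network once and record the masks.
x_ref = rng.standard_normal(sizes[0])
y_ref, masks = forward_relu(x_ref, weights, biases)
W_c, b_c = contract(weights, biases, masks)
y_contracted_ref = W_c @ x_ref + b_c

# A nearby input (standing in for the next video frame) reuses W_c, b_c.
x_next = x_ref + 1e-6 * rng.standard_normal(sizes[0])
y_next_exact, _ = forward_relu(x_next, weights, biases)
y_next_approx = W_c @ x_next + b_c
```

At the reference input the contracted map reproduces the network output exactly, and `W_c` is the network's Jacobian there, which is the sense in which the abstract compares the method to a first-order Taylor expansion. For inputs far from the reference the frozen masks become stale, so in practice the masks would need to be refreshed by some update rule.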
Pages: 621-633
Number of pages: 13