Contraction of Dynamically Masked Deep Neural Networks for Efficient Video Processing

Cited by: 2

Authors:

Rueckauer, Bodo [1,2,3]

Liu, Shih-Chii [1,2]

Affiliations:

[1] Univ Zurich, Inst Neuroinformat, CH-8057 Zurich, Switzerland

[2] Swiss Fed Inst Technol, CH-8057 Zurich, Switzerland

[3] Radboud Univ Nijmegen, Donders Inst Brain Cognit & Behav, NL-6525 XZ Nijmegen, Netherlands

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2022 / Vol. 32 / Issue 02

Keywords:

Deep neural networks; network compression; Taylor approximation; masking;

DOI:

10.1109/TCSVT.2021.3066241

Chinese Library Classification (CLC):

TM [Electrical engineering]; TN [Electronic technology, communication technology];

Subject classification codes:

0808 ; 0809 ;

Abstract:

Sequential data such as video are characterized by spatio-temporal redundancies. As of yet, few deep learning algorithms exploit them to decrease the often massive cost during inference. This work leverages correlations in video data to reduce the size and run-time cost of deep neural networks. Drawing upon the simplicity of the typically used ReLU activation function, we replace this function by dynamically updating masks. The resulting network is a simple chain of matrix multiplications and bias additions, which can be contracted into a single weight matrix and bias vector. Inference then reduces to an affine transformation of the input sample with these contracted parameters. We show that the method is akin to approximating the neural network with a first-order Taylor expansion around a dynamically updating reference point. For triggering these updates, one static and three data-driven mechanisms are analyzed. We evaluate the proposed algorithm on a range of tasks, including pose estimation on surveillance data, road detection on KITTI driving scenes, object detection on ImageNet videos, as well as denoising MNIST digits, and obtain compression rates up to 3.6x.
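
The contraction step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a plain fully connected network with ReLU after every layer except the last, and the names `forward_relu` and `contract` are hypothetical. The masks are recorded at a reference input and then frozen, so the network collapses into one weight matrix and one bias vector. Because a ReLU network is piecewise linear, this affine map matches the original network exactly at the reference input and is only an approximation for other inputs until the masks are refreshed.

```python
import numpy as np


def forward_relu(x, layers):
    """Ordinary forward pass; also records the binary ReLU mask of each hidden layer."""
    masks = []
    h = x
    for i, (W, b) in enumerate(layers):
        h = W @ h + b
        if i < len(layers) - 1:          # assumption: no ReLU on the output layer
            m = (h > 0).astype(h.dtype)  # 1 where the unit is active, 0 otherwise
            masks.append(m)
            h = m * h                    # identical to ReLU(h) at this input
    return h, masks


def contract(layers, masks):
    """Fold the masked chain of affine layers into a single (W_c, b_c)."""
    W_c, b_c = layers[0]
    for m, (W, b) in zip(masks, layers[1:]):
        # A frozen mask acts as a diagonal matrix: it zeroes rows of W_c
        # and the matching entries of b_c before the next layer is applied.
        W_c = W @ (m[:, None] * W_c)
        b_c = W @ (m * b_c) + b
    return W_c, b_c


# Hypothetical toy network: 8 inputs, two hidden layers of 16 units, 4 outputs.
rng = np.random.default_rng(0)
sizes = [8, 16, 16, 4]
layers = [(rng.standard_normal((n_out, n_in)), rng.standard_normal(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]

x_ref = rng.standard_normal(8)               # reference sample, e.g. a key frame
y_ref, masks = forward_relu(x_ref, layers)
W_c, b_c = contract(layers, masks)

# Inference on later samples is a single affine transformation.
x_next = x_ref + 0.01 * rng.standard_normal(8)
y_approx = W_c @ x_next + b_c
```

In this sketch the cost per sample drops from one matrix product per layer to a single product with `W_c`; how often the masks must be recomputed is governed by the update mechanisms the abstract mentions, which are not modelled here.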


Pages: 621-633

Page count: 13

50 items in total

[31] Deep generative neural networks for spectral image processing
Mishra, Puneet
ANALYTICA CHIMICA ACTA, 2022, 1191
[32] Sensory processing and categorization in cortical and deep neural networks
Pinotsis, Dimitris A.
Siegel, Markus
Miller, Earl K.
NEUROIMAGE, 2019, 202
[33] SYNAPTIC DEPRESSION IN DEEP NEURAL NETWORKS FOR SPEECH PROCESSING
Zhang, Wenhao
Li, Hanyu
Yang, Minda
Mesgarani, Nima
2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5865 - 5869
[34] Video and Image Processing with Self-Organizing Neural Networks
Garcia-Rodriguez, Jose
Dominguez, Enrique
Angelopoulou, Anastassia
Psarrou, Alexandra
Jose Mora-Gimeno, Francisco
Orts, Sergio
Manuel Garcia-Chamizo, Juan
ADVANCES IN COMPUTATIONAL INTELLIGENCE, IWANN 2011, PT II, 2011, 6692 : 98 - 104
[35] Efficient and robust bitstream processing in binarised neural networks
Aygun, Sercan
Gunes, Ece Olcay
De Vleeschouwer, Christophe
Electronics Letters, 2021, 57 (05) : 219 - 222
[36] Efficient and robust bitstream processing in binarised neural networks
Aygun, Sercan
Gunes, Ece Olcay
De Vleeschouwer, Christophe
ELECTRONICS LETTERS, 2021, 57 (05) : 219 - 222
[37] Video Saliency Detection Using Deep Convolutional Neural Networks
Zhou, Xiaofei
Liu, Zhi
Gong, Chen
Li, Gongyang
Huang, Mengke
PATTERN RECOGNITION AND COMPUTER VISION, PT II, 2018, 11257 : 308 - 319
[38] DEEP NEURAL NETWORKS FOR NO-REFERENCE VIDEO QUALITY ASSESSMENT
You, Junyong
Korhonen, Jari
2019 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP), 2019, : 2349 - 2353
[39] Multilayer and Multimodal Fusion of Deep Neural Networks for Video Classification
Yang, Xiaodong
Molchanov, Pavlo
Kautz, Jan
MM'16: PROCEEDINGS OF THE 2016 ACM MULTIMEDIA CONFERENCE, 2016, : 978 - 987
[40] Motion vectors and deep neural networks for video camera traps
Riechmann, Miklas
Gardiner, Ross
Waddington, Kai
Rueger, Ryan
Leymarie, Frederic Fol
Rueger, Stefan
ECOLOGICAL INFORMATICS, 2022, 69
