Deep Learning in Latent Space for Video Prediction and Compression

Cited by: 45

Authors:

Liu, Bowen [1]

Chen, Yu [1]

Liu, Shiyu [1]

Kim, Hun-Seok [1]

Affiliations:

[1] Univ Michigan, Ann Arbor, MI 48109 USA

Source:

2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021

Keywords:

EVENT DETECTION;

DOI:

10.1109/CVPR46437.2021.00076

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Learning-based video compression has achieved substantial progress in recent years. The most influential approaches adopt deep neural networks (DNNs) to remove spatial and temporal redundancies by finding appropriate lower-dimensional representations of the frames in a video. We propose a novel DNN-based framework that predicts and compresses video sequences in the latent vector space. The proposed method first learns an efficient lower-dimensional latent space representation of each video frame and then performs inter-frame prediction in that latent domain. The latent domain compression of individual frames is obtained by a deep autoencoder trained with a generative adversarial network (GAN). To exploit the temporal correlation within the video frame sequence, we employ a convolutional long short-term memory (ConvLSTM) network to predict the latent vector representation of the future frame. We demonstrate our method with two applications, video compression and abnormal event detection, which share an identical latent frame prediction network. The proposed method exhibits superior or competitive performance compared to state-of-the-art algorithms specifically designed for either video compression or anomaly detection.
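
Illustrative sketch (not from the paper): the abstract describes predicting the next frame's latent vector and reusing that prediction for both compression and abnormal event detection. The Python below shows one way such a shared prediction could serve both tasks, under loudly stated assumptions. The predictor is a simple linear extrapolation standing in for the ConvLSTM; the latents are plain NumPy arrays standing in for autoencoder outputs; coding a quantized prediction residual and scoring anomalies by the residual norm are assumed designs, not details confirmed by the abstract. All function names and the `step` parameter are hypothetical.

```python
import numpy as np

def predict_next_latent(history):
    """Stand-in predictor (assumption): linear extrapolation from the last
    two latents. The paper uses a ConvLSTM; this is only a placeholder."""
    if len(history) < 2:
        return history[-1].copy()
    return 2.0 * history[-1] - history[-2]

def quantize(residual, step):
    """Uniform scalar quantization of the prediction residual."""
    return np.round(residual / step).astype(np.int32)

def dequantize(symbols, step):
    """Inverse of quantize, up to a rounding error of at most step / 2."""
    return symbols.astype(np.float64) * step

def code_sequence(latents, step=0.1):
    """Code a latent sequence by transmitting quantized prediction residuals.

    Returns (symbols, reconstructed latents, anomaly scores). Each score is
    the L2 norm of the prediction residual for the corresponding frame, so
    symbols[i] and scores[i] belong to latents[i + 1].
    """
    # Assumption: the first latent is sent as-is and seeds the prediction.
    recon = [latents[0].astype(np.float64)]
    symbols, scores = [], []
    for z in latents[1:]:
        # Predict from the *decoded* history so that an encoder and a
        # decoder running this loop would stay in sync.
        z_pred = predict_next_latent(recon)
        residual = z - z_pred
        q = quantize(residual, step)
        symbols.append(q)
        recon.append(z_pred + dequantize(q, step))
        scores.append(float(np.linalg.norm(residual)))
    return symbols, recon, scores
```

In this sketch, a sequence that the predictor tracks well yields residuals near zero, which would be cheap to entropy-code, while a frame that breaks the learned temporal pattern yields a large residual and therefore a high anomaly score.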


Pages: 701-710

Page count: 10
