Deep Learning in Latent Space for Video Prediction and Compression

Cited by: 45
Authors
Liu, Bowen [1]
Chen, Yu [1]
Liu, Shiyu [1]
Kim, Hun-Seok [1]
Affiliation
[1] Univ Michigan, Ann Arbor, MI 48109 USA
Source
2021 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR 2021 | 2021
Keywords
EVENT DETECTION;
DOI
10.1109/CVPR46437.2021.00076
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Learning-based video compression has achieved substantial progress in recent years. The most influential approaches adopt deep neural networks (DNNs) to remove spatial and temporal redundancies by finding appropriate lower-dimensional representations of the frames in a video. We propose a novel DNN-based framework that predicts and compresses video sequences in the latent vector space. The proposed method first learns an efficient lower-dimensional latent space representation of each video frame and then performs inter-frame prediction in that latent domain. The latent domain compression of individual frames is obtained by a deep autoencoder trained with a generative adversarial network (GAN). To exploit the temporal correlation within the video frame sequence, we employ a convolutional long short-term memory (ConvLSTM) network to predict the latent vector representation of the future frame. We demonstrate our method with two applications, video compression and abnormal event detection, which share the same latent frame prediction network. The proposed method exhibits superior or competitive performance compared to state-of-the-art algorithms specifically designed for either video compression or anomaly detection.
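The idea of using one latent-domain predictor for both compression and abnormal event detection can be illustrated with a toy sketch. This is not the authors' implementation. It assumes the per-frame latent vectors are already available (in the paper they come from a GAN-trained autoencoder), and it swaps the ConvLSTM for a hypothetical linear extrapolation in `predict_next`. The names `predict_next`, `code_latents`, and `step` are illustrative only. The same prediction residual is quantized for compression and its norm is used as an anomaly score.

```python
def predict_next(history):
    """Hypothetical stand-in for the learned latent predictor (a ConvLSTM in the paper).

    Linearly extrapolates from the last two reconstructed latents.
    """
    if len(history) == 1:
        return list(history[-1])
    return [2.0 * b - a for a, b in zip(history[-2], history[-1])]


def code_latents(latents, step=0.05):
    """Predictively code a sequence of latent vectors.

    Returns the quantized residual symbols (what a codec would entropy-code),
    the decoder-side reconstructions, and a per-frame anomaly score.
    """
    history = []   # reconstructed latents, identical on encoder and decoder sides
    symbols = []
    scores = []
    for z in latents:
        # With no history yet, predict zeros so the first latent is coded directly.
        pred = predict_next(history) if history else [0.0] * len(z)
        residual = [a - b for a, b in zip(z, pred)]
        # Uniform scalar quantization of the prediction residual.
        q = [round(r / step) for r in residual]
        recon = [p + s * step for p, s in zip(pred, q)]
        history.append(recon)
        symbols.append(q)
        # A large prediction error suggests an unexpected (abnormal) frame.
        scores.append(sum(r * r for r in residual) ** 0.5)
    return symbols, history, scores


# Toy latent trajectory: smooth linear motion with a sudden jump at frame 6.
latents = [[0.1 * t, -0.05 * t] for t in range(10)]
latents[6][0] += 2.0
symbols, recons, scores = code_latents(latents, step=0.05)
```

Predicting from the reconstructed latents rather than the originals keeps encoder and decoder in step, so quantization error does not accumulate. In this toy setup the score also stays high for a few frames after the jump, because the extrapolation is thrown off by the outlier. A learned predictor would behave differently.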
Pages: 701-710
Number of pages: 10