Insights From Generative Modeling for Neural Video Compression

Cited by: 0
Authors
Yang, Ruihan [1]
Yang, Yibo [1]
Marino, Joseph [2]
Mandt, Stephan [1]
Affiliations
[1] Univ Calif Irvine, Dept Comp Sci, Irvine, CA 92697 USA
[2] CALTECH, DeepMind, Pasadena, CA 91125 USA
Funding
National Science Foundation (USA);
Keywords
Transforms; Video compression; Data models; Image coding; Predictive coding; Streaming media; Rate-distortion; Autoregressive models; generative models; normalizing flow; variational inference; video compression;
DOI
10.1109/TPAMI.2023.3260684
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
While recent machine learning research has revealed connections between deep generative models such as VAEs and rate-distortion losses used in learned compression, most of this work has focused on images. In a similar spirit, we view recently proposed neural video coding algorithms through the lens of deep autoregressive and latent variable modeling. We present these codecs as instances of a generalized stochastic temporal autoregressive transform, and propose new avenues for further improvements inspired by normalizing flows and structured priors. We propose several architectures that yield state-of-the-art video compression performance on high-resolution video and discuss their tradeoffs and ablations. In particular, we propose (i) improved temporal autoregressive transforms, (ii) improved entropy models with structured and temporal dependencies, and (iii) variable bitrate versions of our algorithms. Since our improvements are compatible with a large class of existing models, we provide further evidence that the generative modeling viewpoint can advance the neural video coding field.
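The abstract describes neural video codecs as instances of a temporal autoregressive transform. As a rough illustration only, the sketch below shows one simple form such a transform could take: each frame is predicted from the previous one by a shift and a scale, and only the normalized residual is kept. The record does not give the paper's equations, so everything here is an assumption made for illustration: the function names `predict_shift`, `predict_scale`, `encode`, and `decode`, the hand-written predictors (a real codec would use learned networks), and the omission of quantization and entropy coding.

```python
from typing import List

Frame = List[float]  # a frame flattened to a list of pixel values (toy assumption)


def predict_shift(prev: Frame) -> Frame:
    """Hypothetical shift predictor: simply reuse the previous frame."""
    return list(prev)


def predict_scale(prev: Frame) -> Frame:
    """Hypothetical scale predictor: strictly positive, so the transform is invertible."""
    return [1.0 + 0.5 * abs(p) for p in prev]


def encode(frames: List[Frame]) -> List[Frame]:
    """Map frames to residuals: z_t = (x_t - mu(x_{t-1})) / sigma(x_{t-1})."""
    residuals = [list(frames[0])]  # the first frame has no predecessor; keep it as-is
    for t in range(1, len(frames)):
        prev = frames[t - 1]
        mu, sigma = predict_shift(prev), predict_scale(prev)
        residuals.append([(x - m) / s for x, m, s in zip(frames[t], mu, sigma)])
    return residuals


def decode(residuals: List[Frame]) -> List[Frame]:
    """Invert the transform frame by frame: x_t = mu(x_{t-1}) + sigma(x_{t-1}) * z_t."""
    frames = [list(residuals[0])]
    for t in range(1, len(residuals)):
        prev = frames[t - 1]  # decoding is sequential: it needs the frame just rebuilt
        mu, sigma = predict_shift(prev), predict_scale(prev)
        frames.append([m + s * z for z, m, s in zip(residuals[t], mu, sigma)])
    return frames


if __name__ == "__main__":
    video = [[0.0, 1.0], [0.5, 1.5], [1.0, 1.0]]
    recovered = decode(encode(video))
    print(recovered)
```

Because this toy version has no quantization, decoding recovers the input up to floating-point error. In a lossy codec the residuals would be quantized, and the encoder would typically condition on the reconstructed previous frame rather than the original so that encoder and decoder stay in sync.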
Pages: 9908-9921
Page count: 14