Insights From Generative Modeling for Neural Video Compression

Cited by: 0
Authors
Yang, Ruihan [1]
Yang, Yibo [1]
Marino, Joseph [2]
Mandt, Stephan [1]
Affiliations
[1] Univ Calif Irvine, Dept Comp Sci, Irvine, CA 92697 USA
[2] CALTECH, DeepMind, Pasadena, CA 91125 USA
Funding
U.S. National Science Foundation
Keywords
Transforms; Video compression; Data models; Image coding; Predictive coding; Streaming media; Rate-distortion; Autoregressive models; generative models; normalizing flow; variational inference; video compression;
DOI
10.1109/TPAMI.2023.3260684
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
While recent machine learning research has revealed connections between deep generative models such as VAEs and rate-distortion losses used in learned compression, most of this work has focused on images. In a similar spirit, we view recently proposed neural video coding algorithms through the lens of deep autoregressive and latent variable modeling. We present these codecs as instances of a generalized stochastic temporal autoregressive transform, and propose new avenues for further improvements inspired by normalizing flows and structured priors. We propose several architectures that yield state-of-the-art video compression performance on high-resolution video and discuss their tradeoffs and ablations. In particular, we propose (i) improved temporal autoregressive transforms, (ii) improved entropy models with structured and temporal dependencies, and (iii) variable bitrate versions of our algorithms. Since our improvements are compatible with a large class of existing models, we provide further evidence that the generative modeling viewpoint can advance the neural video coding field.
Pages: 9908-9921
Number of pages: 14