Insights From Generative Modeling for Neural Video Compression

Cited by: 0

Authors:

Yang, Ruihan ^{[1]}

Yang, Yibo ^{[1]}

Marino, Joseph ^{[2]}

Mandt, Stephan ^{[1]}

Affiliations:

[1] Univ Calif Irvine, Dept Comp Sci, Irvine, CA 92697 USA

[2] CALTECH, DeepMind, Pasadena, CA 91125 USA

Source:

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE | 2023, Vol. 45, No. 8

Funding:

National Science Foundation (USA);

Keywords:

Transforms; Video compression; Data models; Image coding; Predictive coding; Streaming media; Rate-distortion; Autoregressive models; generative models; normalizing flow; variational inference; video compression;

DOI:

10.1109/TPAMI.2023.3260684

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

While recent machine learning research has revealed connections between deep generative models such as VAEs and rate-distortion losses used in learned compression, most of this work has focused on images. In a similar spirit, we view recently proposed neural video coding algorithms through the lens of deep autoregressive and latent variable modeling. We present these codecs as instances of a generalized stochastic temporal autoregressive transform, and propose new avenues for further improvements inspired by normalizing flows and structured priors. We propose several architectures that yield state-of-the-art video compression performance on high-resolution video and discuss their tradeoffs and ablations. In particular, we propose (i) improved temporal autoregressive transforms, (ii) improved entropy models with structured and temporal dependencies, and (iii) variable bitrate versions of our algorithms. Since our improvements are compatible with a large class of existing models, we provide further evidence that the generative modeling viewpoint can advance the neural video coding field.

Pages: 9908-9921

Number of pages: 14
