Rate-Distortion Optimized Encoding for Deep Image Compression

Cited by: 4

Authors:

Schafer, Michael [1]

Pientka, Sophie [1]

Pfaff, Jonathan [1,2]

Schwarz, Heiko [1,3]

Marpe, Detlev [1]

Wiegand, Thomas [1,2,4]

Affiliations:

[1] Heinrich Hertz Inst Nachrichtentech Berlin GmbH, Fraunhofer Inst Telecommun, Video Commun & Applicat Dept, D-10587 Berlin, Germany

[2] Heinrich Hertz Inst Nachrichtentech Berlin GmbH, Fraunhofer Inst Telecommun, D-10587 Berlin, Germany

[3] Free Univ Berlin, Dept Math & Comp Sci, D-14195 Berlin, Germany

[4] Berlin Inst Technol, Dept Elect Engn & Comp Sci, D-10623 Berlin, Germany

Source:

IEEE OPEN JOURNAL OF CIRCUITS AND SYSTEMS | 2021 / Vol. 2

Keywords:

Video coding; Image coding; Vector quantization; Nonlinear distortion; Bit rate; Rate-distortion; Signal processing algorithms; Deep image compression; variational auto-encoders; rate-distortion optimized encoding; non-linear transform coding; VIDEO; EFFICIENCY;

DOI:

10.1109/OJCAS.2021.3124995

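Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronics and communication technology];

Discipline classification code: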

0808 ; 0809 ;

Abstract:

Deep-learned variational auto-encoders (VAEs) have shown remarkable capabilities for lossy image compression. These neural networks typically employ non-linear convolutional layers to find a compressible representation of the input image. Advanced techniques such as vector quantization, context-adaptive arithmetic coding and variable-rate compression have been implemented in these auto-encoders. Notably, these networks rely on an end-to-end approach, which differs fundamentally from hybrid, block-based video coding systems. As a result, signal-dependent encoder optimizations have not yet been thoroughly investigated for VAEs, even though rate-distortion optimized encoding heavily determines the compression performance of state-of-the-art video codecs. Designing such optimizations for non-linear, multi-layered networks requires understanding the relationship between the quantization, the bit allocation of the features and the distortion. This paper therefore examines the rate-distortion performance of a variable-rate VAE. In particular, it demonstrates that the trained encoder network typically finds features with a near-optimal bit allocation across the channels. Furthermore, it approximates the relationship between distortion and quantization by a higher-order polynomial whose coefficients can be robustly estimated. Based on these considerations, the authors investigate an encoding algorithm for the Lagrange optimization, which significantly improves the coding efficiency.
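The Lagrange optimization mentioned in the abstract can be illustrated with the generic rate-distortion criterion used in video coding: among several encoding candidates, pick the one that minimizes J = D + lambda * R. The sketch below is not the authors' algorithm. The names Candidate, lagrangian_cost and choose_candidate are hypothetical, and the distortion and rate values are made up purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One hypothetical encoding choice with its measured cost terms."""
    label: str
    distortion: float  # assumed unit: mean squared error
    rate: float        # assumed unit: bits per pixel


def lagrangian_cost(candidate: Candidate, lam: float) -> float:
    """Generic rate-distortion cost J = D + lambda * R."""
    return candidate.distortion + lam * candidate.rate


def choose_candidate(candidates: list[Candidate], lam: float) -> Candidate:
    """Return the candidate with the smallest Lagrangian cost."""
    return min(candidates, key=lambda c: lagrangian_cost(c, lam))


# Illustrative values only; they are not taken from the paper.
candidates = [
    Candidate("coarse", distortion=40.0, rate=0.25),
    Candidate("medium", distortion=20.0, rate=0.50),
    Candidate("fine", distortion=12.0, rate=1.00),
]

for lam in (10.0, 20.0, 100.0):
    best = choose_candidate(candidates, lam)
    print(f"lambda={lam}: {best.label} (J={lagrangian_cost(best, lam)})")
```

A small lambda makes rate cheap, so the low-distortion candidate wins. A large lambda penalizes rate, so the coarse candidate wins. A signal-dependent encoder of the kind the abstract describes would apply such a criterion to its own quantization decisions.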


Pages: 633-647

Page count: 15

References: 39 in total
