A Differentiable Entropy Model for Learned Image Compression

Cited by: 4
Authors
Presta, Alberto [1]
Fiandrotti, Attilio [1]
Tartaglione, Enzo [2]
Grangetto, Marco [1]
Affiliations
[1] Univ Turin, Comp Sci Dept, Turin, Italy
[2] Telecom Paris, Inst Polytech Paris, LTCI, Palaiseau, France
Source
IMAGE ANALYSIS AND PROCESSING, ICIAP 2023, PT I | 2023 / Vol. 14233
Keywords
Learned image coding; entropy estimation; differentiable entropy; autoencoder; image compression;
DOI
10.1007/978-3-031-43148-7_28
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In an end-to-end learned image compression framework, an encoder projects the image onto a low-dimensional, quantized latent space, while a decoder recovers the original image. The encoder and decoder are jointly trained with standard gradient backpropagation to minimize a rate-distortion (RD) cost function that accounts for both the distortion between the original and reconstructed images and the rate of the quantized latent space. State-of-the-art methods rely on an auxiliary neural network to estimate the rate R of the latent space. We propose a non-parametric entropy model that estimates the statistical frequencies of the quantized latent space during training. The proposed model is differentiable, so it can be plugged into the cost function as a rate proxy to be minimized, and it can be adapted to a given context without retraining. Our experiments show performance comparable to a learned rate estimator, and better performance when the model is adapted over a temporal context.
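As a rough illustration of the idea in the abstract, the sketch below estimates the entropy of a latent vector from a soft (smooth) histogram over quantization levels and combines it with a distortion term into an RD cost. This is not the authors' model: the softmax-over-distances assignment, the names `soft_entropy_bits` and `rd_cost`, the `temperature` parameter, and the weighting `R + lam * D` are all assumptions made for illustration only.

```python
import math


def soft_entropy_bits(latents, levels, temperature=0.5):
    """Entropy in bits per symbol of a soft histogram (illustrative assumption).

    Each latent value is softly assigned to every quantization level with a
    softmax over negative squared distances, so the resulting frequencies vary
    smoothly with the latents instead of jumping as a hard histogram would.
    """
    counts = [0.0] * len(levels)
    for y in latents:
        logits = [-((y - c) ** 2) / temperature for c in levels]
        shift = max(logits)  # subtract the max for numerical stability
        weights = [math.exp(l - shift) for l in logits]
        total = sum(weights)
        for k, w in enumerate(weights):
            counts[k] += w / total
    probs = [c / len(latents) for c in counts]
    # Shannon entropy of the estimated frequencies; skip empty bins.
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def rd_cost(rate_bits, distortion, lam=0.01):
    """Hypothetical rate-distortion cost R + lam * D (weighting is an assumption)."""
    return rate_bits + lam * distortion


if __name__ == "__main__":
    levels = [-1.0, 0.0, 1.0]
    rate = soft_entropy_bits([0.2, 0.9, -0.7, 0.1], levels)
    print(rate, rd_cost(rate, distortion=25.0))
```

With a small `temperature` the soft histogram approaches ordinary frequency counts of the rounded latents; a larger value smooths the assignment, which is what would let a gradient-based optimizer use the estimate as a rate proxy in an autodiff framework.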
Pages: 328-339
Number of pages: 12