Learned Multi-Resolution Variable-Rate Image Compression With Octave-Based Residual Blocks

Cited by: 18
Authors
Akbari, Mohammad [1]
Liang, Jie [1]
Han, Jingning [2]
Tu, Chengjie [3]
Affiliations
[1] Simon Fraser Univ, Engn Sci, Burnaby, BC V5A 1S6, Canada
[2] Google Inc, WebM Codec Team, Mountain View, CA 94043 USA
[3] Tencent Technol, Shenzhen 518054, Guangdong, Peoples R China
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Image coding; Decoding; Convolutional codes; Transforms; Codecs; Image reconstruction; Linear programming; Deep learning; generalized octave convolutions; image compression; residual coding; variable-rate;
DOI
10.1109/TMM.2021.3068523
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Recently, deep learning-based image compression has shown the potential to outperform traditional codecs. However, most existing methods train multiple networks for multiple bit rates, which increases the implementation complexity. In this paper, we propose a new variable-rate image compression framework, which employs generalized octave convolutions (GoConv) and generalized octave transposed-convolutions (GoTConv) with built-in generalized divisive normalization (GDN) and inverse GDN (IGDN) layers. Novel GoConv- and GoTConv-based residual blocks are also developed in the encoder and decoder networks. Our scheme also uses stochastic rounding-based scalar quantization. To further improve the performance, we encode the residual between the input and the reconstructed image from the decoder network as an enhancement layer. To enable a single model to operate at different bit rates and to learn multi-rate image features, a new objective function is introduced. Experimental results show that the proposed framework, trained with the variable-rate objective function, outperforms standard codecs such as the H.265/HEVC-based BPG as well as state-of-the-art learning-based variable-rate methods.
Pages: 3013-3021
Page count: 9