ANFIC: Image Compression Using Augmented Normalizing Flows

Cited by: 39

Authors:

Ho, Yung-Han [1]

Chan, Chih-Chun [1]

Peng, Wen-Hsiao [1]

Hang, Hsueh-Ming [2]

Domanski, Marek [3]

Affiliations:

[1] Natl Yang Ming Chiao Tung Univ, Dept Comp Sci, Hsinchu 300, Taiwan

[2] Natl Yang Ming Chiao Tung Univ, Dept Elect Engn, Hsinchu 300, Taiwan

[3] Poznan Univ Tech, Inst Multimedia Telecommun, PL-60965 Poznan, Poland

Source:

IEEE OPEN JOURNAL OF CIRCUITS AND SYSTEMS | 2021 / Vol. 2

Keywords:

Training; Degradation; Image coding; Codes; Convolution; Stacking; Network architecture; Learning-based image compression; flow-based image compression; augmented normalizing flows; perceptually lossless image compression; variable rate image compression;

DOI:

10.1109/OJCAS.2021.3123201

Chinese Library Classification:

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

This paper introduces an end-to-end learned image compression system, termed ANFIC, based on Augmented Normalizing Flows (ANF). ANF is a new type of flow model that stacks multiple variational autoencoders (VAEs) for greater model expressiveness. VAE-based image compression has become mainstream, showing promising compression performance. Our work presents the first attempt to leverage VAE-based compression in a flow-based framework. ANFIC further advances compression efficiency by stacking and hierarchically extending multiple VAEs. The invertibility of ANF, together with our training strategies, enables ANFIC to support a wide range of quality levels without changing the encoding and decoding networks. Extensive experimental results show that, in terms of PSNR-RGB, ANFIC performs comparably to or better than state-of-the-art learned image compression. Moreover, it performs close to VVC intra coding, from low-rate compression up to perceptually lossless compression. In particular, ANFIC achieves state-of-the-art performance when extended with conditional convolution for variable rate compression with a single model. The source code of ANFIC can be found at https://github.com/dororojames/ANFIC.
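The invertibility mentioned in the abstract can be illustrated with a small toy sketch. It is not the authors' architecture: it assumes a simplified additive form of an autoencoding step, in which an "encoder" updates a latent `z` from the image `x` and a "decoder" then updates `x` from `z`. The layer sizes, the random `tanh` layers, and all function names here are hypothetical and chosen only for illustration. Stacking two such steps and running them backwards recovers the input, up to floating-point error.

```python
import math
import random


def make_layer(rng, n_out, n_in):
    """Random weight matrix for a toy layer (hypothetical stand-in for a network)."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]


def apply_layer(weights, v):
    """Toy nonlinear map: tanh(W v). It does not need to be invertible itself."""
    return [math.tanh(sum(w * vi for w, vi in zip(row, v))) for row in weights]


def step_forward(x, z, w_enc, w_dec):
    """One additive autoencoding step: update z from x, then x from the new z."""
    z = [zi + gi for zi, gi in zip(z, apply_layer(w_enc, x))]
    x = [xi - gi for xi, gi in zip(x, apply_layer(w_dec, z))]
    return x, z


def step_inverse(x, z, w_enc, w_dec):
    """Exact inverse of step_forward: undo the updates in reverse order."""
    x = [xi + gi for xi, gi in zip(x, apply_layer(w_dec, z))]
    z = [zi - gi for zi, gi in zip(z, apply_layer(w_enc, x))]
    return x, z


def flow_forward(x, z, steps):
    for w_enc, w_dec in steps:
        x, z = step_forward(x, z, w_enc, w_dec)
    return x, z


def flow_inverse(x, z, steps):
    for w_enc, w_dec in reversed(steps):
        x, z = step_inverse(x, z, w_enc, w_dec)
    return x, z


rng = random.Random(0)
DIM_X, DIM_Z = 8, 4  # assumed toy dimensions
steps = [(make_layer(rng, DIM_Z, DIM_X), make_layer(rng, DIM_X, DIM_Z))
         for _ in range(2)]  # two stacked steps

x0 = [rng.gauss(0.0, 1.0) for _ in range(DIM_X)]
z0 = [0.0] * DIM_Z

x1, z1 = flow_forward(x0, z0, steps)
x_rec, z_rec = flow_inverse(x1, z1, steps)

max_err_x = max(abs(a - b) for a, b in zip(x_rec, x0))
max_err_z = max(abs(a - b) for a, b in zip(z_rec, z0))
print(max_err_x < 1e-9, max_err_z < 1e-9)
```

Because each step only adds or subtracts a function of the other variable, the inverse exists whatever the inner layers compute. That is the property that lets a stack of encoder/decoder pairs behave as a flow.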


Pages: 613-626

Page count: 14

