NVTC: Nonlinear Vector Transform Coding

Cited by: 9
Authors
Feng, Runsen [1 ]
Guo, Zongyu [1 ]
Li, Weiping [1 ]
Chen, Zhibo [1 ]
Affiliations
[1] University of Science and Technology of China, Hefei, People's Republic of China
Source
2023 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, CVPR | 2023
Keywords
SPHERE PACKING PROBLEM; QUANTIZATION;
DOI
10.1109/CVPR52729.2023.00591
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In theory, vector quantization (VQ) is always better than scalar quantization (SQ) in terms of rate-distortion (RD) performance [33]. Recent state-of-the-art methods for neural image compression are mainly based on nonlinear transform coding (NTC) with uniform scalar quantization, overlooking the benefits of VQ due to its exponentially increased complexity. In this paper, we first investigate some toy sources, demonstrating that even if modern neural networks considerably enhance the compression performance of SQ with nonlinear transform, there is still an insurmountable chasm between SQ and VQ. Therefore, revolving around VQ, we propose a novel framework for neural image compression named Nonlinear Vector Transform Coding (NVTC). NVTC solves the critical complexity issue of VQ through (1) a multi-stage quantization strategy and (2) nonlinear vector transforms. In addition, we apply entropy-constrained VQ in latent space to adaptively determine the quantization boundaries for joint rate-distortion optimization, which improves the performance both theoretically and experimentally. Compared to previous NTC approaches, NVTC demonstrates superior rate-distortion performance, faster decoding speed, and smaller model size. Our code is available at https://github.com/USTC-IMCL/NVTC.
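The entropy-constrained VQ idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function name `ecvq_assign`, the two-entry codebook, the codeword probabilities, and the Lagrange multiplier `lam` are all hypothetical toy values chosen for illustration. The sketch assumes the standard entropy-constrained rule, in which a vector is assigned to the codeword that minimizes squared-error distortion plus `lam` times the ideal code length, so that the decision boundary moves toward less probable codewords as `lam` grows.

```python
import math

def ecvq_assign(x, codebook, probs, lam):
    """Return the index minimizing distortion + lam * rate for vector x.

    Hypothetical helper for illustration only, not part of the NVTC code.
    """
    best_index, best_cost = -1, math.inf
    for i, (c, p) in enumerate(zip(codebook, probs)):
        # Squared Euclidean distance between x and codeword c.
        distortion = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
        # Ideal code length in bits for a codeword with probability p.
        rate = -math.log2(p)
        cost = distortion + lam * rate
        if cost < best_cost:
            best_index, best_cost = i, cost
    return best_index

# Toy values (assumptions): two codewords with unequal probabilities.
codebook = [(0.0, 0.0), (1.0, 1.0)]
probs = [0.9, 0.1]
x = (0.6, 0.6)

nearest = ecvq_assign(x, codebook, probs, lam=0.0)      # plain nearest neighbor
constrained = ecvq_assign(x, codebook, probs, lam=0.5)  # rate-aware assignment
print(nearest, constrained)
```

With `lam=0.0` the rule reduces to nearest-neighbor VQ. With a positive `lam`, the same input is assigned to the cheaper, more probable codeword even though it is farther away, which is the sense in which the quantization boundaries adapt to the rate term.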
Pages: 6101-6110
Page count: 10
Related papers
45 in total
[1] Agustsson E., 2020, P ADV NEUR INF PROC, V33, P12367
[2] Agustsson E., 2017, ADV NEUR IN, V30
[3] [Anonymous], 2021, CLIC2021 CHALL LEARN
[4] Arthur D., 2006, TECHNICAL REPORT
[5] Balle J., 2020, IEEE J. of Selected Topics in Signal Processing, V15, P339
[6] Balle J., 2018, INT C LEARNING REPRE
[7] Balle J., 2017, P 5 INT C LEARN REPR
[8] Begaint J., 2020, CompressAI: a PyTorch library and evaluation platform for end-to-end compression research
[9] Juang B.-H., 1982, Proceedings of ICASSP 82. IEEE International Conference on Acoustics, Speech and Signal Processing, P597
[10] Bjontegaard G., 2001, VCEG-M33 document