QNet: An Adaptive Quantization Table Generator Based on Convolutional Neural Network

Cited by: 6

Authors:

Yan, Xiao [1]

Fan, Yibo [1]

Chen, Kewei [1]

Yu, Xulin [2]

Zeng, Xiaoyang [1]

Affiliations:

[1] Fudan University, State Key Laboratory of ASIC and System, Shanghai 200433, People's Republic of China

[2] Alibaba Group, Hangzhou 311121, People's Republic of China

Source:

IEEE Transactions on Image Processing | 2020 / Vol. 29

Funding:

National Natural Science Foundation of China

Keywords:

Quantization (signal); Image coding; Optimization; Transform coding; Rate-distortion; Discrete cosine transforms; Standards; Convolutional neural network (CNN); image compression; JPEG; quantization table; peak signal-to-noise ratio (PSNR); structural similarity index measurement (SSIM); algorithm

DOI:

10.1109/TIP.2020.3030126

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

JPEG is one of the most widely used lossy image-compression standards, and its compression performance depends largely on the quantization table. In this work, we use a Convolutional Neural Network (CNN) to generate an image-adaptive quantization table in a standard-compliant way. We first build an image set containing more than 10,000 images and generate their optimal quantization tables with a classical genetic algorithm. We then propose a method that efficiently extracts and fuses the frequency-domain and spatial-domain information of each image to train a regression network that directly generates adaptive quantization tables. In addition, we extract several representative quantization tables from the dataset and train a classification network to select the optimal one for each image, which further improves compression performance and computational efficiency. Tests on diverse images show that the proposed method clearly outperforms the state-of-the-art method. Compared with the standard table at a compression rate of 1.0 bpp, the regression and classification networks provide average Peak Signal-to-Noise Ratio (PSNR) gains of nearly 1.2 and 1.4 dB, respectively. Measured by Structural Similarity Index Measurement (SSIM), the improvements are 0.4% and 0.54%, respectively. The proposed method also has competitive computational efficiency: the regression and classification networks take only 15 and 6.25 milliseconds, respectively, to process a 768×512 image on a single CPU core at 3.20 GHz.
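
To make the role of the quantization table concrete, the following is a minimal sketch of how a table acts on one 8x8 block and how the resulting PSNR can be measured. It is not the QNet method and involves no neural network. It only shows the baseline JPEG quantize/dequantize step that any generated table would plug into. The table is the well-known example luminance table from the JPEG standard; the function names (`dct2`, `idct2`, `jpeg_roundtrip`, `psnr`) are hypothetical helpers chosen for this illustration, and entropy coding (and therefore bit rate) is left out.

```python
import math

N = 8  # baseline JPEG transforms 8x8 blocks

# Example luminance quantization table from the JPEG standard (Annex K).
STD_LUMA_QT = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

# Orthonormal DCT-II basis: C[k][n] = a(k) * cos((2n + 1) * k * pi / (2N))
C = [[(math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N))
      * math.cos((2 * n + 1) * k * math.pi / (2 * N))
      for n in range(N)] for k in range(N)]


def matmul(a, b):
    """Multiply two N x N matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)]
            for i in range(N)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def dct2(block):
    """2-D DCT of an 8x8 block: C * X * C^T."""
    return matmul(matmul(C, block), transpose(C))


def idct2(coeffs):
    """Inverse 2-D DCT: C^T * Y * C."""
    return matmul(matmul(transpose(C), coeffs), C)


def jpeg_roundtrip(block, qt):
    """Level-shift, transform, quantize with `qt`, dequantize, and reconstruct."""
    shifted = [[p - 128 for p in row] for row in block]
    coeffs = dct2(shifted)
    # Quantization is the only lossy step: divide by the table entry and round.
    dequant = [[round(coeffs[i][j] / qt[i][j]) * qt[i][j] for j in range(N)]
               for i in range(N)]
    recon = idct2(dequant)
    return [[min(255, max(0, round(v + 128))) for v in row] for row in recon]


def psnr(original, reconstructed):
    """Peak signal-to-noise ratio in dB for 8-bit samples."""
    mse = sum((original[i][j] - reconstructed[i][j]) ** 2
              for i in range(N) for j in range(N)) / (N * N)
    return math.inf if mse == 0 else 10 * math.log10(255 ** 2 / mse)


if __name__ == "__main__":
    # Arbitrary synthetic block, used only to exercise the functions.
    block = [[(37 * i + 11 * j + 5 * i * j) % 256 for j in range(N)]
             for i in range(N)]
    print("PSNR with the example table:",
          psnr(block, jpeg_roundtrip(block, STD_LUMA_QT)))
```

Swapping `STD_LUMA_QT` for another 8x8 table of positive integers changes only the divisor in the quantization step, which is why an image-adaptive table can remain compatible with a standard JPEG decoder.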


Pages: 9654-9664

Page count: 11
