Quantization-Based Adaptive Deep Image Compression Using Semantic Information

Cited by: 2

Authors:

Lei, Zhongyue [1]

Hong, Xuemin [1]

Shi, Jianghong [1]

Su, Minxian [2]

Lin, Chaoheng [3]

Xia, Wei [4]

Affiliations:

[1] Xiamen Univ, Sch Informat, Xiamen 361005, Peoples R China

[2] Xiamen Satellite Positioning Applicat Co Ltd, Xiamen 361008, Peoples R China

[3] Xiamen Beidou Key Lab Appl Technol, Xiamen 361008, Peoples R China

[4] Fujian Centerm Informat Co Ltd, Fuzhou 350028, Peoples R China

Source:

IEEE ACCESS | 2023 / Vol. 11

Keywords:

Deep image compression; semantic importance; adaptive coding; hybrid contexts; CLASSIFICATION; OPTIMIZATION; RECOGNITION; CONTEXT;

DOI:

10.1109/ACCESS.2023.3326718

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

Deep image coding (DIC) for hybrid application contexts has recently attracted significant research interest because of its potential to support both human and machine visual tasks. Since the regions of interest (ROI) are different for different application contexts, it is important to design an adaptive image coding mechanism in practical DIC. In this paper, we propose the first quantization-based adaptive DIC framework for hybrid contexts of image reconstruction and classification. This framework can be applied to upgrade existing fixed-rate DIC models into adaptive DIC for hybrid contexts. It consists of two key modules: a semantics-based ROI mask generation module and a module for generating ROI gain and inverse gain matrices. These matrices are used to control the quantization accuracy of different latent vector elements, thereby achieving encoding at different rates while prioritizing the reconstruction quality of the ROI. Moreover, we propose a five-stage training method for the quantization-based adaptive DIC model to optimize the rate-distortion-classification-perception (RDCP) tradeoff. Experiments over a wide rate range show that our method achieves superior RDCP tradeoff performance. Compared to the benchmark scheme BM-CHENG, the proposed algorithm improves the classification accuracy by an average of 15%. The average relative improvements on various metrics, such as natural image quality evaluator (NIQE), learned perceptual image patch similarity (LPIPS), and feature similarity index measure (FSIM), are about 22%, 47%, and 1%, respectively. The proposed algorithm is a promising candidate for fast adaptive coding with low-complexity constraints.
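
The abstract says that ROI gain and inverse gain matrices control the quantization accuracy of individual latent elements. As a rough illustration only, and not the authors' implementation, the sketch below scales a latent tensor by a spatial gain map before rounding and rescales it by the inverse gain afterwards, so that ROI positions get a finer effective step size. The function name `gain_quantize`, the two scalar gain values, the binary mask, and the tensor shapes are all assumptions made for this example; the paper's matrices are learned and its mask comes from a semantics-based module.

```python
import numpy as np

def gain_quantize(latent, roi_mask, roi_gain=4.0, bg_gain=1.0):
    """Quantize a (C, H, W) latent with a per-position gain map.

    Hypothetical helper: roi_gain and bg_gain are illustrative constants,
    not values taken from the paper.
    """
    # Gain map of shape (H, W): larger gain inside the ROI.
    gain = np.where(roi_mask, roi_gain, bg_gain)
    inverse_gain = 1.0 / gain
    # Scale, round to integers (the symbols an entropy coder would see),
    # then undo the scaling on the decoder side.
    symbols = np.round(latent * gain)          # broadcasts over channels
    reconstructed = symbols * inverse_gain
    return symbols, reconstructed

rng = np.random.default_rng(0)
latent = rng.normal(size=(8, 4, 4))            # toy latent, 8 channels
roi_mask = np.zeros((4, 4), dtype=bool)
roi_mask[1:3, 1:3] = True                      # toy 2x2 region of interest

symbols, reconstructed = gain_quantize(latent, roi_mask)
error = np.abs(latent - reconstructed)
roi_error = error[:, roi_mask].max()           # bounded by 0.5 / roi_gain
bg_error = error[:, ~roi_mask].max()           # bounded by 0.5 / bg_gain
print("max ROI error:", roi_error)
print("max background error:", bg_error)
```

With a gain g, rounding error in the reconstructed latent is at most 0.5 / g, which is why a larger gain in the ROI means higher quantization accuracy there, at the cost of a wider symbol range and therefore more bits.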


Pages: 118061-118077

Page count: 17
