Versatile Video Coding-Based Coding Tree Unit Level Image Compression With Dual Quantization Parameters for Hybrid Vision

Cited by: 3

Authors:

Kim, Shin [1]

Lee, Yegi [1]

Yoon, Kyoungro [1]

Affiliations:

[1] Konkuk Univ, Dept Comp Sci & Engn, Seoul 05029, South Korea

Source:

IEEE ACCESS | 2023 / Vol. 11

Keywords:

Image coding; Machine vision; Codecs; Bit rate; Image reconstruction; Video coding; Object segmentation; Video coding for machines; machine vision; hybrid vision; versatile video coding; MACHINE;

DOI:

10.1109/ACCESS.2023.3263207

Chinese Library Classification:

TP [Automation technology, computer technology];

Subject classification code:

0812 ;

Abstract:

Machine-vision-based image analysis is widely used in smart industry. High-quality images are needed for accurate machine analysis, but handling high-definition images can be problematic in constrained environments such as low-bandwidth networks or low-capacity storage. Lowering the image resolution is a straightforward way to reduce the amount of image data, but it discards much information and degrades machine vision performance. Moreover, human supervision may be necessary in contingencies that machine vision cannot handle. An image compression method that considers both machine and human vision is therefore required, one that offers higher compression efficiency than the state-of-the-art codec, strong machine vision performance, and human-recognizable quality. In this paper, we propose Versatile Video Coding (VVC)-based image compression for hybrid vision, i.e., machine vision and human vision. Our work provides coding tree unit (CTU)-level image compression with dual quantization parameters (QPs), assigned according to a quantization parameter map and the saliency extracted by an object detection network: in salient regions, the proposed method maintains high quality with a low QP, whereas in non-salient regions it lowers the quality with a high QP. Compared with VVC, the proposed method achieves a bitrate reduction of up to 25.5% in machine vision tasks, showing higher compression efficiency while retaining good machine vision performance. From the perspective of human vision, the proposed method provides human-perceptible image quality and preserves acceptable objective quality values.
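The dual-QP idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_ctu_qp_map`, the QP values 27 and 42, and the rule that a CTU counts as salient when it overlaps any detected bounding box are all assumptions made for illustration. The 128x128 CTU size is the largest that VVC allows; the abstract does not state which size the paper uses.

```python
from math import ceil


def build_ctu_qp_map(width, height, boxes, ctu_size=128,
                     qp_salient=27, qp_background=42):
    """Return one QP per CTU as a 2-D list in raster order.

    boxes: iterable of (x0, y0, x1, y1) pixel rectangles from an object
    detector, with x1 and y1 exclusive. All names and default values here
    are hypothetical, chosen only to illustrate the dual-QP scheme.
    """
    cols = ceil(width / ctu_size)
    rows = ceil(height / ctu_size)
    # Start with the high (coarse) QP everywhere: the non-salient default.
    qp_map = [[qp_background] * cols for _ in range(rows)]
    for x0, y0, x1, y1 in boxes:
        # Clip the box to the image; skip boxes that fall outside it.
        x0, y0 = max(0, x0), max(0, y0)
        x1, y1 = min(width, x1), min(height, y1)
        if x0 >= x1 or y0 >= y1:
            continue
        # Every CTU touched by the box gets the low (fine) QP.
        for r in range(y0 // ctu_size, (y1 - 1) // ctu_size + 1):
            for c in range(x0 // ctu_size, (x1 - 1) // ctu_size + 1):
                qp_map[r][c] = qp_salient
    return qp_map


# A 512x256 image with one detected object spanning the first two CTU columns.
qp_map = build_ctu_qp_map(512, 256, [(100, 50, 200, 120)])
for row in qp_map:
    print(row)
```

In this sketch the resulting map would then be handed to the encoder as per-CTU QP settings; how the paper combines the map with the detector's saliency output is not specified in the abstract.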


Pages: 34498-34509

Page count: 12

