Versatile Video Coding-Based Coding Tree Unit Level Image Compression With Dual Quantization Parameters for Hybrid Vision

Cited by: 3
Authors
Kim, Shin [1]
Lee, Yegi [1]
Yoon, Kyoungro [1]
Affiliations
[1] Konkuk Univ, Dept Comp Sci & Engn, Seoul 05029, South Korea
Keywords
Image coding; Machine vision; Codecs; Bit rate; Image reconstruction; Video coding; Object segmentation; Video coding for machines; machine vision; hybrid vision; versatile video coding; MACHINE
DOI
10.1109/ACCESS.2023.3263207
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Image analysis based on machine vision is widely used in smart industry. High-quality images are needed for good machine analysis results, but handling high-definition images can be problematic in constrained environments such as low-bandwidth networks or low-capacity storage. Lowering the image resolution is a straightforward way to reduce the amount of image data, but it causes substantial information loss and degrades machine vision performance. Moreover, human supervision may be necessary for contingencies that machine vision cannot handle. An image compression method that considers both machine and human vision is therefore required: one that offers higher compression efficiency than the state-of-the-art codec, good machine vision performance, and human-recognizable quality. In this paper, we propose Versatile Video Coding (VVC)-based image compression for hybrid vision, i.e., machine vision and human vision. Our work provides coding tree unit (CTU)-level image compression with dual quantization parameters (QPs), assigned according to a quantization parameter map and the saliency extracted by an object detection network: in salient regions, the proposed method maintains high quality with a low QP, while in non-salient regions it lowers the quality with a high QP. Compared with VVC, the proposed method achieves a bitrate reduction of up to 25.5% in machine vision tasks, showing higher compression efficiency while retaining good machine vision performance. From the perspective of human vision, the proposed method provides human-perceptible image quality and preserves acceptable objective quality values.
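The dual-QP idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `ctu_qp_map`, the box format, and the QP values 27 and 42 are assumptions chosen for illustration, and the 128-pixel CTU size is simply the largest CTU size VVC allows. The sketch marks every CTU that overlaps a detected bounding box as salient and gives it the low QP; all other CTUs get the high QP.

```python
from typing import List, Tuple

# Hypothetical box format: (x0, y0, x1, y1) in pixels, with x1/y1 exclusive.
Box = Tuple[int, int, int, int]


def ctu_qp_map(width: int, height: int, boxes: List[Box],
               ctu_size: int = 128, qp_salient: int = 27,
               qp_background: int = 42) -> List[List[int]]:
    """Return a rows x cols grid holding one QP per CTU.

    CTUs overlapping any box get qp_salient; the rest get qp_background.
    The default QP values are illustrative assumptions, not from the paper.
    """
    # Ceiling division so partial CTUs at the right/bottom border are counted.
    cols = (width + ctu_size - 1) // ctu_size
    rows = (height + ctu_size - 1) // ctu_size
    qp_map = [[qp_background] * cols for _ in range(rows)]

    for x0, y0, x1, y1 in boxes:
        # Clamp the box to the image and skip empty boxes.
        x0, y0 = max(0, x0), max(0, y0)
        x1, y1 = min(width, x1), min(height, y1)
        if x0 >= x1 or y0 >= y1:
            continue
        # Mark every CTU touched by the box as salient.
        for r in range(y0 // ctu_size, (y1 - 1) // ctu_size + 1):
            for c in range(x0 // ctu_size, (x1 - 1) // ctu_size + 1):
                qp_map[r][c] = qp_salient
    return qp_map


if __name__ == "__main__":
    # A 512x256 image with one detected object spanning two CTUs in the top row.
    for row in ctu_qp_map(512, 256, [(100, 50, 200, 120)]):
        print(row)
```

In a real pipeline the boxes would come from an object detection network and the resulting map would be passed to the encoder as per-CTU QP settings; how that hand-off works depends on the encoder and is not specified here.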
Pages: 34498-34509
Page count: 12