Towards 360∘\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^{\circ }$$\end{document} image compression for machines via modulating pixel significance

被引：0

作者：

Silin Zheng ^{[1
]}

Xuelin Shen ^{[2
]}

Qiudan Zhang ^{[1
]}

Zhuo Chen ^{[3
]}

Wenhan Yang ^{[3
]}

Xu Wang ^{[1
]}

机构：

[1] Shenzhen University,College of Computer Science and Software Engineering

[2] Guangdong Laboratory of Aritificial Intelligence and Digital Economy(SZ),undefined

[3] Peng Cheng Laboratory,undefined

来源：

Multimedia Tools and Applications | 2024年 / 83卷 / 42期

关键词：

Image processing; Image coding for machine; Machine vision; Image compression;

D O I：

10.1007/s11042-024-19139-2

中图分类号：

学科分类号：

摘要：

The rapid growth of computer vision-based applications, including smart cities and autonomous driving, has created a pressing demand for efficient 360∘\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^{\circ }$$\end{document} image compression and computer vision analytics. In most circumstances, 360∘\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^{\circ }$$\end{document} image compression and computer vision face challenges arising from the oversampling inherent in the Equirectangular Projection (ERP). However, these two fields often employ divergent technological approaches. Since image compression aims to reduce redundancy, computer vision analytics attempts to compensate for the semantic distortion caused by the projection process, resulting in a potential conflict between the two objectives. This paper explores a potential route, i.e.360∘\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^{\circ }$$\end{document} Image Coding for Machine (360-ICM), which offers an image processing framework that addresses both object deformation and oversampling redundancy within a unified framework. The key innovation lies in inferring a pixel-wise significant map by jointly considering the requirements of redundancy removal and object deformation offsetting. The significance map would be subsequently fed to a deformation-aware image compression network, guiding the bit allocation process as an external condition. More specifically, we employ a deformation-aware image compression network that is characterized by the Spatial Feature Transform (SFT) layer, which is capable of performing complex affine transformations of high-level semantic features, to be essential in dealing with the deformation. The image compression network and significance inference network are jointly trained under the supervision of a 360∘\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$^{\circ }$$\end{document} image-specified object detection network, obtaining a compact representation that is both analytics-oriented and deformation-aware. Extensive experimental results have demonstrated the superiority of the proposed method over existing state-of-the-art image codecs in terms of rate-analytics performance.

引用

页码：90271 / 90288

页数：17