End-to-End Optimized ROI Image Compression

Cited by: 52
Authors
Cai, Chunlei [1]
Chen, Li [1]
Zhang, Xiaoyun [1]
Gao, Zhiyong [1]
Affiliations
[1] Shanghai Jiao Tong Univ, Inst Image Commun & Network Engn, Dept Elect Engn, Shanghai 200240, Peoples R China
Funding
Natural Science Foundation of Shanghai; National Natural Science Foundation of China
Keywords
Region of interest; lossy image compression; object segmentation; ROI coding; rate-distortion optimization; convolutional neural network
DOI
10.1109/TIP.2019.2960869
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Compressing an image with more bits automatically allocated to the region of interest (ROI) than to the background can both protect key information and remove substantial redundancy. This paper models ROI image compression as an optimization problem: minimizing a weighted sum of the rate of the image and the distortion of the ROI. The traditional framework solves this problem by cascading ROI prediction and ROI coding, a design that cannot reach the optimized solution. To improve coding performance, we propose a novel deep-learning-based unified framework that can achieve rate-distortion optimization for ROI compression. Specifically, the proposed framework includes a pair of ROI encoder and decoder convolutional neural networks and a learned entropy codec. The encoder network simultaneously generates multiscale representations that support efficient rate allocation and an implicit ROI mask that guides rate allocation. The proposed framework performs ROI image compression automatically and can be optimized from data in an end-to-end manner. To train the framework effectively by backpropagation, we develop a soft-to-hard ROI prediction scheme that makes the entire framework differentiable. To improve visual quality, we propose a hierarchical distortion loss function that protects both pixel-level fidelity for the ROI and structural similarity for the entire image. The proposed framework is implemented in two scenarios: salient-target and face-target ROI compression. Comparative experiments demonstrate the advantages of the proposed framework over the traditional framework, including considerably better subjective visual quality, significantly higher objective ROI compression performance, and higher execution efficiency.
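Illustrative note (not part of the published abstract): the objective described above, a weighted sum of the image rate and the ROI distortion, together with the soft-to-hard mask idea, can be sketched roughly as follows. This is a minimal NumPy sketch under stated assumptions, not the authors' implementation. The function names `soft_roi_mask` and `roi_rd_loss`, the temperature-scaled sigmoid relaxation, the mask-weighted mean squared error, and the weight `lam` are all hypothetical choices made for illustration; the abstract does not specify the exact relaxation or distortion measure.

```python
import numpy as np


def soft_roi_mask(logits, temperature):
    """Sigmoid relaxation of a binary ROI mask (an assumed relaxation).

    A large temperature gives a smooth mask; as the temperature shrinks
    toward zero the mask approaches a hard 0/1 mask.
    """
    return 1.0 / (1.0 + np.exp(-logits / temperature))


def roi_rd_loss(image, recon, mask, rate_bits, lam):
    """Return (rate + lam * ROI distortion, ROI distortion).

    The ROI distortion here is a mask-weighted mean squared error,
    an assumed stand-in for the paper's distortion term.
    """
    sq_err = (image - recon) ** 2
    roi_distortion = float(np.sum(mask * sq_err) / (np.sum(mask) + 1e-8))
    return rate_bits + lam * roi_distortion, roi_distortion


# Toy example: the left column is ROI, the right column is background.
logits = np.array([[4.0, -4.0], [4.0, -4.0]])
image = np.zeros((2, 2))
recon = np.array([[0.0, 1.0], [0.0, 1.0]])  # error only in the background

soft_mask = soft_roi_mask(logits, temperature=1.0)
hard_mask = soft_roi_mask(logits, temperature=0.01)

soft_loss, soft_dist = roi_rd_loss(image, recon, soft_mask, rate_bits=10.0, lam=5.0)
hard_loss, hard_dist = roi_rd_loss(image, recon, hard_mask, rate_bits=10.0, lam=5.0)
```

In this toy case the reconstruction error lies entirely in the background, so the nearly hard mask ignores it and the loss reduces to the rate term, while the smoother mask still charges a small distortion penalty for it.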
Pages: 3442-3457
Number of pages: 16