ANALYSIS OF NEURAL IMAGE COMPRESSION NETWORKS FOR MACHINE-TO-MACHINE COMMUNICATION

Cited by: 7
Authors
Fischer, Kristian [1]
Forsch, Christian [1]
Herglotz, Christian [1]
Kaup, Andre [1]
Affiliations
[1] Friedrich Alexander Univ Erlangen Nurnberg FAU, Multimedia Commun & Signal Proc, Cauerstr 7, D-91058 Erlangen, Germany
Source
2021 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING (ICIP) | 2021
Keywords
Neural Compression Networks; Video Coding for Machines; Machine-to-Machine Communication;
DOI
10.1109/ICIP42928.2021.9506763
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Video and image coding for machines (VCM) is an emerging field that aims to develop compression methods that yield optimal bitstreams when the decoded frames are analyzed by a neural network. Several existing approaches improve classic hybrid codecs for this task. However, neural compression networks (NCNs) have made enormous progress in image coding in recent years. It is therefore reasonable to consider such NCNs when the information sink at the decoder side is also a neural network. To this end, we build an evaluation framework that analyzes the performance of four state-of-the-art NCNs when a Mask R-CNN segments objects from the decoded image. Compression performance is measured by the weighted average precision on the Cityscapes dataset. Based on this analysis, we find that networks with leaky ReLU as the non-linearity and training with SSIM as the distortion criterion result in the highest coding gains for the VCM task. Furthermore, we show that the GAN-based NCN architecture achieves the best coding performance and even outperforms the recently standardized Versatile Video Coding (VVC) for the given scenario.
Pages: 2079-2083
Number of pages: 5