Learned Hierarchical B-frame Coding with Adaptive Feature Modulation for YUV 4:2:0 Content

Cited by: 0
Authors
Chen, Mu-Jung [1]
Xie, Hong-Sheng [1]
Chien, Cheng [1]
Peng, Wen-Hsiao [1]
Hang, Hsueh-Ming [2]
Affiliations
[1] Natl Yang Ming Chiao Tung Univ, Comp Sci Dept, Hsinchu, Taiwan
[2] Natl Yang Ming Chiao Tung Univ, Elect Engn Dept, Hsinchu, Taiwan
Source
2023 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS | 2023
Keywords
video compression; YUV 4:2:0 format; variable rate; adaptive coding
DOI
10.1109/ISCAS46773.2023.10181948
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper introduces a learned hierarchical B-frame coding scheme in response to the Grand Challenge on Neural Network-based Video Coding at ISCAS 2023. We specifically address three issues: (1) B-frame coding, (2) YUV 4:2:0 coding, and (3) content-adaptive variable-rate coding with a single model. Most learned video codecs operate internally in the RGB domain for P-frame coding; B-frame coding for YUV 4:2:0 content is largely under-explored. In addition, while there have been prior works on variable-rate coding with conditional convolution, most of them fail to consider the content information. We build our scheme on conditional augmented normalized flows (CANF). It features conditional motion and inter-frame codecs for efficient B-frame coding. To cope with YUV 4:2:0 content, two conditional inter-frame codecs are used to process the Y and UV components separately, with the coding of the UV components conditioned additionally on the Y component. Moreover, we introduce adaptive feature modulation in every convolutional layer, taking into account both the content information and the coding levels of B-frames, to achieve content-adaptive variable-rate coding. Experimental results show that our model outperforms x265 and the winner of last year's challenge on commonly used datasets in terms of PSNR-YUV.
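The abstract refers to the "coding levels of B-frames" without spelling them out. As a rough illustration only, the sketch below lists a conventional hierarchical (dyadic) B-frame coding order for one group of pictures, with the temporal level of each frame. The function name `hierarchical_b_order`, the power-of-two GOP size, and the depth-first traversal are assumptions made for this example; the record does not state the GOP structure used in the paper.

```python
def hierarchical_b_order(gop_size):
    """Return (frame_index, level) pairs in coding order for one GOP.

    Hypothetical helper for illustration. Frames 0 and gop_size are the
    anchor frames (level 0). Each B-frame sits midway between two frames
    that are already coded and is one level deeper than the split above it.
    """
    if gop_size < 1 or gop_size & (gop_size - 1):
        raise ValueError("gop_size must be a power of two")

    order = [(0, 0), (gop_size, 0)]  # anchors are coded first

    def split(left, right, level):
        # No frame lies strictly between adjacent indices.
        if right - left < 2:
            return
        mid = (left + right) // 2
        order.append((mid, level))   # B-frame predicted from left and right
        split(left, mid, level + 1)
        split(mid, right, level + 1)

    split(0, gop_size, 1)
    return order


if __name__ == "__main__":
    for index, level in hierarchical_b_order(8):
        print(f"frame {index}: level {level}")
```

In a scheme like the one described, the level value is the kind of side information that a per-layer modulation could be conditioned on, alongside content features; how the paper actually does this is not specified here.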
Pages: 5