Learned Hierarchical B-frame Coding with Adaptive Feature Modulation for YUV 4:2:0 Content

Cited by: 0

Authors:

Chen, Mu-Jung [1]

Xie, Hong-Sheng [1]

Chien, Cheng [1]

Peng, Wen-Hsiao [1]

Hang, Hsueh-Ming [2]

Affiliations:

[1] Natl Yang Ming Chiao Tung Univ, Comp Sci Dept, Hsinchu, Taiwan

[2] Natl Yang Ming Chiao Tung Univ, Elect Engn Dept, Hsinchu, Taiwan

Source:

2023 IEEE INTERNATIONAL SYMPOSIUM ON CIRCUITS AND SYSTEMS, ISCAS | 2023

Keywords:

video compression; YUV 4:2:0 format; variable rate; adaptive coding

DOI:

10.1109/ISCAS46773.2023.10181948

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper introduces a learned hierarchical B-frame coding scheme in response to the Grand Challenge on Neural Network-based Video Coding at ISCAS 2023. We specifically address three issues: (1) B-frame coding, (2) YUV 4:2:0 coding, and (3) content-adaptive variable-rate coding with a single model. Most learned video codecs operate internally in the RGB domain for P-frame coding. B-frame coding for YUV 4:2:0 content is largely under-explored. In addition, while there have been prior works on variable-rate coding with conditional convolution, most of them fail to consider the content information. We build our scheme on conditional augmented normalized flows (CANF). It features conditional motion and inter-frame codecs for efficient B-frame coding. To cope with YUV 4:2:0 content, two conditional inter-frame codecs are used to process the Y and UV components separately, with the coding of the UV components conditioned additionally on the Y component. Moreover, we introduce adaptive feature modulation in every convolutional layer, taking into account both the content information and the coding levels of B-frames to achieve content-adaptive variable-rate coding. Experimental results show that our model outperforms x265 and the winner of last year's challenge on commonly used datasets in terms of PSNR-YUV.
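The adaptive feature modulation mentioned in the abstract can be pictured as a channel-wise affine transform whose scale and shift depend on the content and on the B-frame coding level. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify the form of the modulation, and in a learned codec the parameters would come from a trained network. The names `modulate` and `toy_modulation_params`, the per-channel mean used as a stand-in for content information, and the constants 0.1 and 0.5 are all hypothetical.

```python
from typing import List, Sequence, Tuple

# A feature map stored as nested lists with shape (channels, height, width).
FeatureMap = List[List[List[float]]]


def modulate(features: FeatureMap,
             scale: Sequence[float],
             shift: Sequence[float]) -> FeatureMap:
    """Channel-wise affine modulation: out[c] = scale[c] * features[c] + shift[c]."""
    return [[[s * v + b for v in row] for row in channel]
            for channel, s, b in zip(features, scale, shift)]


def toy_modulation_params(features: FeatureMap,
                          level: int) -> Tuple[List[float], List[float]]:
    """Hand-written (hypothetical) rule that maps content and coding level
    to per-channel scale and shift. A learned codec would predict these
    with a small network instead of fixed constants."""
    scale, shift = [], []
    for channel in features:
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)   # crude proxy for content information
        scale.append(1.0 + 0.1 * level)    # assumed dependence on coding level
        shift.append(-0.5 * mean)          # assumed dependence on content
    return scale, shift


# Two channels of a 2x2 feature map, modulated for coding level 0.
features = [[[1.0, 2.0], [3.0, 4.0]],
            [[0.0, 0.0], [0.0, 0.0]]]
scale, shift = toy_modulation_params(features, level=0)
modulated = modulate(features, scale, shift)
```

Passing a scale of all ones and a shift of all zeros leaves the features unchanged, which is a convenient sanity check for any modulation layer of this kind.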


Pages: 5
