Deep Reference Frame Generation Method for VVC Inter Prediction Enhancement

Cited by: 2

Authors:

Jia, Jianghao [1]

Zhang, Yuantong [1]

Zhu, Han [1]

Chen, Zhenzhong [1]

Liu, Zizheng [2]

Xu, Xiaozhong [3]

Liu, Shan [3]

Affiliations:

[1] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430072, Peoples R China

[2] Tencent Shenzhen, Shenzhen 518000, Peoples R China

[3] Tencent Amer, Palo Alto, CA 94306 USA

Source:

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY | 2024 / Vol. 34 / Issue 05

Keywords:

Interpolation; Optical flow; Extrapolation; Bidirectional control; Kernel; Encoding; Streaming media; Neural-network-based video coding; versatile video coding (VVC); inter prediction; deep learning; NETWORK;

DOI:

10.1109/TCSVT.2023.3299410

Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];

Discipline classification code:

0808 ; 0809 ;

Abstract:

In video coding, inter prediction aims to reduce temporal redundancy by using previously encoded frames as references. The quality of reference frames is crucial to the performance of inter prediction. This paper presents a deep reference frame generation method to optimize inter prediction in Versatile Video Coding (VVC). Specifically, reconstructed frames are sent to a well-designed frame generation network to synthesize a picture similar to the current encoding frame. The synthesized picture serves as an additional reference frame inserted into the reference picture list (RPL) to provide a more reliable reference for subsequent motion estimation (ME) and motion compensation (MC). The frame generation network employs optical flow to predict motion precisely. Moreover, an optical flow reorganization strategy is proposed to enable bi-directional and uni-directional predictions with only a single network architecture. To reasonably apply our method to VVC, we further introduce a normative modification of the temporal motion vector prediction (TMVP). Integrated into the VVC reference software VTM-15.0, the deep reference frame generation method achieves coding efficiency improvements of 5.22%, 3.61%, and 3.83% for the Y component under random access (RA), low delay B (LDB), and low delay P (LDP) configurations, respectively. The proposed method has been discussed at a Joint Video Exploration Team (JVET) meeting and is currently part of Exploration Experiments (EE) for further study.
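The idea of synthesizing an extra reference picture from optical flow can be illustrated with a small sketch. This is not the network described in the abstract: it assumes the flow between two reconstructed frames is already known, assumes linear motion, and uses nearest-neighbour warping in place of a learned synthesis stage. The names `backward_warp` and `synthesize_reference` are hypothetical and chosen only for this illustration. Scaling a single flow field by the target time gives interpolation (the bi-directional case) or extrapolation (the uni-directional case), which is one plausible reading of how one flow-based design could serve both.

```python
import numpy as np

def backward_warp(frame, flow):
    """Sample `frame` at (x + dx, y + dy) using nearest-neighbour lookup.

    frame: (H, W) array. flow: (H, W, 2) array of (dx, dy) per target pixel.
    Sampling positions outside the frame are clamped to the border.
    """
    h, w = frame.shape
    ys, xs = np.mgrid[0:h, 0:w]
    src_x = np.clip(np.rint(xs + flow[..., 0]).astype(int), 0, w - 1)
    src_y = np.clip(np.rint(ys + flow[..., 1]).astype(int), 0, h - 1)
    return frame[src_y, src_x]

def synthesize_reference(ref0, ref1, flow_0_to_1, t):
    """Predict the frame at time t, with ref0 at t=0 and ref1 at t=1.

    Assumption (not from the paper): motion is linear, so the flow from the
    target time to each reference is a scaled copy of flow_0_to_1.
    0 < t < 1: interpolation, both references are warped and blended.
    t > 1:     extrapolation, only the most recent frame (ref1) is warped.
    """
    if 0.0 < t < 1.0:
        from0 = backward_warp(ref0, -t * flow_0_to_1)
        from1 = backward_warp(ref1, (1.0 - t) * flow_0_to_1)
        return (1.0 - t) * from0 + t * from1
    return backward_warp(ref1, -(t - 1.0) * flow_0_to_1)

# Toy data: one bright pixel moving 2 pixels to the right per frame.
ref0 = np.zeros((8, 8)); ref0[3, 1] = 1.0
ref1 = np.zeros((8, 8)); ref1[3, 3] = 1.0
flow = np.zeros((8, 8, 2)); flow[..., 0] = 2.0

# The synthesized picture is appended to a stand-in reference picture list.
reference_picture_list = [ref1, ref0]
reference_picture_list.append(synthesize_reference(ref0, ref1, flow, 0.5))
```

In this toy setup the interpolated picture places the bright pixel midway between its positions in the two references, so a block search against it would find a near-zero motion vector. A real codec integration would also have to handle occlusions, sub-pixel motion, and signalling, none of which this sketch attempts.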


Pages: 3111-3124

Page count: 14
