Multi-feature fusion for efficient inter prediction in versatile video coding

Cited by: 1

Authors:

Wei, Xiaojie [1]

Zeng, Hongji [1]

Fang, Ying [1]

Lin, Liqun [1]

Chen, Weiling [1]

Xu, Yiwen [1]

Affiliation:

[1] Fuzhou Univ, Fuzhou Coll Town, Fujian Key Lab Intelligent Proc & Wireless Transmi, 2 North Wulong River Ave, Fuzhou, Fujian, Peoples R China

Source:

JOURNAL OF REAL-TIME IMAGE PROCESSING | 2024 / Vol. 21 / Issue 06

Keywords:

Versatile video coding; Complexity optimization; Block partition; CNN; Multi-feature fusion; CU PARTITION; OPTIMIZATION; DECISION;

DOI:

10.1007/s11554-024-01564-z

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Versatile Video Coding (VVC) introduces advanced coding tools, such as the QuadTree with nested Multi-type Tree (QTMT) partition structure, and outperforms High Efficiency Video Coding (HEVC) in coding performance. However, this improvement comes at the cost of higher coding complexity. In this paper, we propose a multi-feature fusion framework that integrates rate-distortion-complexity optimization theory with deep learning techniques to reduce the complexity of QTMT partitioning for VVC inter prediction. First, the proposed framework extracts luminance, motion, residual, and quantization features from video frames and fuses them through a convolutional neural network to predict the minimum partition size of Coding Units (CUs). Next, a novel rate-distortion-complexity loss function is designed to balance computational complexity and compression performance. Through this loss function, we can adjust the distributions of rate-distortion-complexity costs. This adjustment changes the prediction bias of the network and sets constraints on different block partition sizes to facilitate complexity adjustment. Compared to the anchor VTM-13.0, the proposed method reduces encoding time by 10.14% to 56.62%, with the BDBR increase confined to a range of 0.31% to 6.70%. The proposed method achieves a broader range of complexity adjustment while maintaining coding performance, surpassing both traditional methods and deep learning-based methods.
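
The abstract does not give the form of the rate-distortion-complexity loss, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes a hypothetical per-class cost J_k = D_k + lam * R_k + mu * T_k for each candidate minimum CU size, and a loss equal to the expected normalized cost under the network's predicted distribution. All names, weights, and numbers here are illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rdc_costs(distortion, rate, time, lam, mu):
    """Hypothetical per-class cost J_k = D_k + lam * R_k + mu * T_k.

    lam weights rate against distortion; mu weights encoding time
    (complexity). Neither weight is taken from the paper.
    """
    return [d + lam * r + mu * t for d, r, t in zip(distortion, rate, time)]


def expected_cost_loss(logits, costs):
    """Expected min-max-normalized cost under the predicted distribution.

    The loss lies in [0, 1] and is smallest when probability mass sits on
    the lowest-cost class.
    """
    probs = softmax(logits)
    lo, hi = min(costs), max(costs)
    span = (hi - lo) or 1.0  # avoid division by zero when all costs are equal
    return sum(p * (c - lo) / span for p, c in zip(probs, costs))


# Illustrative (made-up) statistics for four minimum-CU-size classes,
# ordered from coarsest to finest partitioning.
distortion = [40.0, 30.0, 24.0, 22.0]
rate = [10.0, 12.0, 14.0, 16.0]
time = [1.0, 2.0, 4.0, 8.0]

costs_quality = rdc_costs(distortion, rate, time, lam=0.5, mu=0.0)
costs_speed = rdc_costs(distortion, rate, time, lam=0.5, mu=5.0)
```

In this toy setup, raising `mu` moves the lowest-cost class from the finest partition toward a coarser one, which is one way a single weight could shift a network's prediction bias toward lower complexity.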

Pages: 14
