Multi-feature fusion for efficient inter prediction in versatile video coding

Cited by: 1
Authors
Wei, Xiaojie [1 ]
Zeng, Hongji [1 ]
Fang, Ying [1 ]
Lin, Liqun [1 ]
Chen, Weiling [1 ]
Xu, Yiwen [1 ]
Affiliations
[1] Fuzhou Univ, Fuzhou Coll Town, Fujian Key Lab Intelligent Proc & Wireless Transmi, 2 North Wulong River Ave, Fuzhou, Fujian, Peoples R China
Keywords
Versatile video coding; Complexity optimization; Block partition; CNN; Multi-feature fusion; CU PARTITION; OPTIMIZATION; DECISION;
DOI
10.1007/s11554-024-01564-z
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Versatile Video Coding (VVC) introduces various advanced coding techniques and tools, such as the QuadTree with nested Multi-type Tree (QTMT) partition structure, and outperforms High Efficiency Video Coding (HEVC) in coding performance. However, this improvement comes at the cost of increased coding complexity. In this paper, we propose a multi-feature fusion framework that integrates rate-distortion-complexity optimization theory with deep learning techniques to reduce the complexity of QTMT partitioning for VVC inter prediction. First, the proposed framework extracts luminance, motion, residual, and quantization features from video frames and fuses them through a convolutional neural network to predict the minimum partition size of Coding Units (CUs). Next, a novel rate-distortion-complexity loss function is designed to balance computational complexity and compression performance. Through this loss function, the distribution of rate-distortion-complexity costs can be adjusted; this adjustment shifts the prediction bias of the network and sets constraints on different block partition sizes to facilitate complexity adjustment. Compared with the anchor VTM-13.0, the proposed method reduces encoding time by 10.14% to 56.62%, with the BDBR increase confined to 0.31% to 6.70%. The proposed method achieves a broader range of complexity adjustment while maintaining coding performance, surpassing both traditional and deep learning-based methods.
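The abstract does not give the form of the rate-distortion-complexity loss, so the following is only a minimal sketch of the general idea, assuming a cost-sensitive loss in which each predicted minimum-CU-size class carries a rate-distortion penalty plus a weighted complexity term. The function names, the three-class layout, the weight `lam`, and all numeric values are hypothetical placeholders and are not taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def rdc_cost_matrix(rd_penalty, complexity, lam):
    """cost[j][k]: assumed cost of predicting class k when the true class is j.

    rd_penalty[j][k] models the rate-distortion loss of the decision,
    complexity[k] models the encoding effort implied by class k,
    and lam (hypothetical) trades the two off.
    """
    n = len(complexity)
    return [[rd_penalty[j][k] + lam * complexity[k] for k in range(n)]
            for j in range(len(rd_penalty))]

def expected_cost_loss(logits, true_class, cost):
    """Expected cost of the network's softmax prediction for one sample."""
    probs = softmax(logits)
    return sum(p * c for p, c in zip(probs, cost[true_class]))

def cheapest_class(cost_row):
    """Index of the class with the lowest cost in one row of the matrix."""
    return min(range(len(cost_row)), key=lambda k: cost_row[k])

# Hypothetical classes: 0 = large minimum CU size (least search effort),
# 2 = small minimum CU size (most search effort).
COMPLEXITY = [0.2, 0.5, 1.0]
# Stopping the partition too early (k < j) is assumed to cost RD performance.
RD_PENALTY = [
    [0.0, 0.0, 0.0],
    [0.6, 0.0, 0.0],
    [1.5, 0.6, 0.0],
]

low = rdc_cost_matrix(RD_PENALTY, COMPLEXITY, lam=1.0)
high = rdc_cost_matrix(RD_PENALTY, COMPLEXITY, lam=3.0)
print(cheapest_class(low[1]), cheapest_class(high[1]))
```

With these placeholder values, raising `lam` moves the lowest-cost decision for a sample of true class 1 from class 1 to class 0, which illustrates how reweighting the costs could bias predictions toward coarser partitions and lower complexity.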
Pages: 14