Multi-feature fusion for efficient inter prediction in versatile video coding

Cited by: 1
Authors
Wei, Xiaojie [1]
Zeng, Hongji [1]
Fang, Ying [1]
Lin, Liqun [1]
Chen, Weiling [1]
Xu, Yiwen [1]
Affiliations
[1] Fuzhou Univ, Fuzhou Coll Town, Fujian Key Lab Intelligent Proc & Wireless Transmi, 2 North Wulong River Ave, Fuzhou, Fujian, Peoples R China
Keywords
Versatile video coding; Complexity optimization; Block partition; CNN; Multi-feature fusion; CU PARTITION; OPTIMIZATION; DECISION;
DOI
10.1007/s11554-024-01564-z
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Versatile Video Coding (VVC) introduces various advanced coding techniques and tools, such as the QuadTree with nested Multi-type Tree (QTMT) partition structure, and outperforms High Efficiency Video Coding (HEVC) in coding performance. However, this improvement in coding performance comes with an increase in coding complexity. In this paper, we propose a multi-feature fusion framework that integrates rate-distortion-complexity optimization theory with deep learning techniques to reduce the complexity of QTMT partitioning for VVC inter prediction. First, the proposed framework extracts luminance, motion, residual, and quantization features from video frames and fuses them through a convolutional neural network to predict the minimum partition size of Coding Units (CUs). Next, a novel rate-distortion-complexity loss function is designed to balance computational complexity and compression performance. Through this loss function, the distribution of rate-distortion-complexity costs can be adjusted. This adjustment shifts the prediction bias of the network and sets constraints on different block partition sizes, which makes the complexity adjustable. Compared to the anchor VTM-13.0, the proposed method reduces encoding time by 10.14% to 56.62%, with the BDBR increase confined to a range of 0.31% to 6.70%. The proposed method achieves a broader range of complexity adjustment while maintaining coding performance, surpassing both traditional methods and deep learning-based methods.
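The abstract does not give the form of the rate-distortion-complexity loss, so the following is only a minimal Python sketch of the general idea, not the authors' formulation. It assumes a hypothetical per-class cost J = D + lam * R + mu * T for each candidate minimum CU size and an expected-cost loss over the network's softmax output; the names `rdc_costs`, `expected_rdc_loss`, `lam`, `mu`, and all numeric values are illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rdc_costs(distortion, rate, time, lam, mu):
    """Hypothetical cost per candidate minimum CU size: J = D + lam*R + mu*T.

    lam weights the bit rate and mu weights the encoding time (assumed form).
    """
    return [d + lam * r + mu * t for d, r, t in zip(distortion, rate, time)]


def expected_rdc_loss(logits, costs):
    """Expected normalized cost under the predicted class distribution."""
    probs = softmax(logits)
    lo = min(costs)
    span = (max(costs) - lo) or 1.0  # avoid division by zero when all costs tie
    return sum(p * (c - lo) / span for p, c in zip(probs, costs))


def cheapest_class(costs):
    """Index of the candidate size with the lowest cost."""
    return min(range(len(costs)), key=costs.__getitem__)


# Made-up numbers for three candidate sizes, ordered large -> small.
distortion = [10.0, 6.0, 4.0]  # smaller blocks: lower distortion
rate = [2.0, 3.0, 4.0]         # smaller blocks: more bits
time = [1.0, 3.0, 8.0]         # smaller blocks: more encoding time

for mu in (0.0, 1.0, 2.0):
    costs = rdc_costs(distortion, rate, time, lam=1.0, mu=mu)
    print(mu, costs, cheapest_class(costs))
```

In this toy setup, raising `mu` moves the lowest-cost class from the smallest candidate size toward the largest, which illustrates how a complexity weight could bias a predictor toward earlier termination of partitioning.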
Pages: 14
Related papers
50 in total
  • [1] Multi-feature fusion refine network for video captioning
    Wang, Guan-Hong
    Du, Ji-Xiang
    Zhang, Hong-Bo
    JOURNAL OF EXPERIMENTAL & THEORETICAL ARTIFICIAL INTELLIGENCE, 2022, 34 (03) : 483 - 497
  • [2] Semantic Enhanced Video Captioning with Multi-feature Fusion
    Niu, Tian-Zi
    Dong, Shan-Shan
    Chen, Zhen-Duo
    Luo, Xin
    Guo, Shanqing
    Huang, Zi
    Xu, Xin-Shun
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2023, 19 (06)
  • [3] Efficient inter partitioning of versatile video coding based on supervised contrastive learning
    Lin, JieLian
    Lin, Hongbin
    Zhang, Zhichen
    Xu, Yiwen
    KNOWLEDGE-BASED SYSTEMS, 2024, 296
  • [4] Versatile Video Coding-Post Processing Feature Fusion: A Post-Processing Convolutional Neural Network with Progressive Feature Fusion for Efficient Video Enhancement
    Das, Tanni
    Liang, Xilong
    Choi, Kiho
    APPLIED SCIENCES-BASEL, 2024, 14 (18):
  • [5] HIERARCHICAL SPARSE CODING BASED ON SPATIAL POOLING AND MULTI-FEATURE FUSION
    Weng, Chaoqun
    Wang, Hongxing
    Yuan, Junsong
    2013 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME 2013), 2013,
  • [6] Computational prediction of allergenic proteins based on multi-feature fusion
    Liu, Bin
    Yang, Ziman
    Liu, Qing
    Zhang, Ying
    Ding, Hui
    Lai, Hongyan
    Li, Qun
    FRONTIERS IN GENETICS, 2023, 14
  • [7] Early Fire Recognition Based on Multi-Feature Fusion of Video Smoke
    Wang, Lin
    Li, Aiguo
    PROCEEDINGS OF THE 36TH CHINESE CONTROL CONFERENCE (CCC 2017), 2017, : 5318 - 5323
  • [8] Temporal Prediction Model-Based Fast Inter CU Partition for Versatile Video Coding
    Li, Yue
    Luo, Fei
    Zhu, Yapei
    SENSORS, 2022, 22 (20)
  • [9] A Fast VVC Intra Prediction Based on Gradient Analysis and Multi-Feature Fusion CNN
    Jing, Zhiyong
    Zhu, Wendi
    Zhang, Qiuwen
    ELECTRONICS, 2023, 12 (09)
  • [10] Pedestrian Crossing Intention Prediction Method Based on Multi-Feature Fusion
    Ma, Jun
    Rong, Wenhui
    WORLD ELECTRIC VEHICLE JOURNAL, 2022, 13 (08):