Efficient inter partitioning of versatile video coding based on supervised contrastive learning

Cited by: 0
Authors
Lin, JieLian [1, 3]
Lin, Hongbin [1 ]
Zhang, Zhichen [1 ]
Xu, Yiwen [1, 2]
Affiliations
[1] Fuzhou Univ, Fujian Key Lab Intelligent Proc & Wireless Transmi, Fuzhou, Peoples R China
[2] Fuzhou Univ, Coll Zhicheng, Fuzhou 350002, Peoples R China
[3] Putian Univ, Sch Mech & Elect Informat Engn, Putian 351100, Fujian, Peoples R China
Keywords
Versatile video coding; Inter prediction; Supervised contrastive learning; Complexity optimization; CU SIZE DECISION; MODE DECISION; PREDICTION; SELECTION; LEVEL
DOI
10.1016/j.knosys.2024.111902
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Versatile Video Coding (VVC) achieves better coding performance than the previous video coding standard, High Efficiency Video Coding (HEVC). Its Quadtree with Nested Multi-Type Tree (QTMT) coding block structure improves coding performance, but it also significantly increases the complexity of VVC inter coding. Complexity optimization is therefore an urgent problem for the practical deployment of VVC. To address it, we propose a Supervised-Contrastive-Learning-based Inter Partitioning (SCLIP) method. First, we formulate the complexity optimization problem as a supervised classification task. Next, we develop a SCLIP Estimation Network (SCLIPEst-Net) comprising a supervised contrastive learning module and a classification module. After training on a newly established dataset, SCLIPEst-Net can predict the partition mode with reasonable accuracy. Finally, we propose an overall SCLIP algorithm that determines the inter partitions of VVC effectively and with low computational overhead. Experimental results show that our method achieves an average Time Saving (TS) of 45.14% with a 2.40% Bjøntegaard Delta Bit Rate (BDBR) under the Random Access (RA) configuration, outperforming the benchmarks.
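As a rough illustration of the supervised contrastive learning idea named in the abstract, the sketch below computes a generic supervised contrastive loss over a small batch of embeddings. It is not the SCLIPEst-Net training objective, whose details the abstract does not give. The function names, the temperature value, the toy embeddings, and the use of integer labels to stand for partition modes are all assumptions made for illustration.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (hypothetical helper, not from the paper).

    For each anchor, samples sharing its label are positives; every other
    sample in the batch appears in the softmax denominator.
    Anchors without any positive are skipped.
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled similarity of the anchor to every other sample.
        logits = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n) if a != i
        }
        # Log-sum-exp with max subtraction for numerical stability.
        max_logit = max(logits.values())
        log_denom = max_logit + math.log(
            sum(math.exp(v - max_logit) for v in logits.values())
        )
        total += -sum(logits[p] - log_denom for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch: two tight clusters in 2-D (assumed data, for illustration only).
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
clustered_labels = [0, 0, 1, 1]  # labels agree with the clusters
mixed_labels = [0, 1, 0, 1]      # labels cut across the clusters

loss_clustered = supcon_loss(toy_embeddings, clustered_labels)
loss_mixed = supcon_loss(toy_embeddings, mixed_labels)
```

The loss is small when same-label embeddings already sit close together and large when they do not, which is the property that lets a downstream classification module separate the classes more easily.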
Pages: 9