Fast Algorithm for Depth Map Intra-Frame Coding 3D-HEVC Based on Swin Transformer and Multi-Branch Network

Citations: 0
Authors
Wang, Fengqin [1]
Du, Yangang [1]
Zhang, Qiuwen [1]
Affiliations
[1] Zhengzhou Univ Light Ind, Coll Comp & Commun Engn, Zhengzhou 450002, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
3D-HEVC; depth map encoding; swin transformer; recursive hierarchical; VIDEO; MULTIVIEW;
DOI
10.3390/electronics14091703
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Three-Dimensional High-Efficiency Video Coding (3D-HEVC) efficiently compresses 3D video by incorporating depth map encoding techniques. However, the quadtree partitioning of depth map coding units (CUs) greatly increases computational complexity, contributing to over 90% of the total encoding time. To overcome the limitations of existing methods in complex edge modeling and partitioning efficiency, this paper presents Swin-Hier Net, a hierarchical CU partitioning prediction model based on the Swin Transformer. First, a multi-branch feature fusion architecture is designed: the Swin Transformer's shifted window attention mechanism extracts global contextual features, lightweight CNNs capture local texture details, and traditional edge/variance features enhance multi-scale representation. Second, a recursive hierarchical decision mechanism dynamically activates sub-CU prediction branches based on the partitioning probability of parent nodes, ensuring strict compliance with the HEVC standard quadtree syntax. Additionally, a hybrid pooling strategy and dilated convolutions improve edge feature retention. Experiments on 3D-HEVC standard test sequences show that, compared to exhaustive traversal methods, the proposed algorithm reduces encoding time by 72.7% on average, lowers the BD-Rate by 1.16%, improves CU partitioning accuracy to 94.5%, and maintains a synthesized view PSNR of 38.68 dB (baseline: 38.72 dB). The model seamlessly integrates into the HTM encoder, offering an efficient solution for real-time 3D video applications.
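The recursive hierarchical decision described in the abstract can be illustrated with a minimal sketch. This is not the authors' Swin-Hier Net: the learned predictor is replaced by a hypothetical variance-based stand-in (`split_probability`), and the names `partition`, `MIN_CU`, and `THRESHOLD` as well as the constant 100.0 are assumptions made for illustration only. What the sketch does show is the gating idea: sub-CU branches are evaluated only when the parent CU is predicted to split, so the output is always a valid quadtree.

```python
from statistics import pvariance

MIN_CU = 8       # smallest CU size allowed in this sketch (assumption)
THRESHOLD = 0.5  # hypothetical cutoff on the split probability


def split_probability(block):
    """Hypothetical stand-in for the learned predictor.

    Maps the pixel variance of a block to a value in [0, 1).
    """
    pixels = [p for row in block for p in row]
    var = pvariance(pixels)
    return var / (var + 100.0)  # 100.0 is an arbitrary squashing constant


def partition(frame, x, y, size, predict=split_probability):
    """Return the leaf CUs as (x, y, size) tuples for the CU at (x, y)."""
    block = [row[x:x + size] for row in frame[y:y + size]]
    # Child branches are activated only if the parent is predicted to split.
    if size <= MIN_CU or predict(block) < THRESHOLD:
        return [(x, y, size)]
    half = size // 2
    leaves = []
    for dy in (0, half):
        for dx in (0, half):
            leaves += partition(frame, x + dx, y + dy, half, predict)
    return leaves


# A flat 64x64 depth block, and one with a 4-pixel-wide vertical edge region.
flat = [[50] * 64 for _ in range(64)]
edge = [[200 if col < 4 else 50 for col in range(64)] for _ in range(64)]

flat_leaves = partition(flat, 0, 0, 64)
edge_leaves = partition(edge, 0, 0, 64)
```

In this toy setting the flat block stays a single 64x64 CU, while the block containing the edge is split down to 8x8 only along the edge and left coarse elsewhere. A real integration would replace `split_probability` with the trained model's output for each depth level.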
Pages: 18