Fast Algorithm for Depth Map Intra-Frame Coding 3D-HEVC Based on Swin Transformer and Multi-Branch Network

Cited by: 0

Authors:

Wang, Fengqin ^{[1]}

Du, Yangang ^{[1]}

Zhang, Qiuwen ^{[1]}

Affiliations:

[1] Zhengzhou Univ Light Ind, Coll Comp & Commun Engn, Zhengzhou 450002, Peoples R China

Source:

ELECTRONICS | 2025 / Vol. 14 / No. 09

Funding:

National Natural Science Foundation of China;

Keywords:

3D-HEVC; depth map encoding; swin transformer; recursive hierarchical; VIDEO; MULTIVIEW;

DOI:

10.3390/electronics14091703

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

Three-Dimensional High-Efficiency Video Coding (3D-HEVC) efficiently compresses 3D video by incorporating depth map encoding techniques. However, the quadtree partitioning of depth map coding units (CUs) greatly increases computational complexity, contributing to over 90% of the total encoding time. To overcome the limitations of existing methods in complex edge modeling and partitioning efficiency, this paper presents Swin-Hier Net, a hierarchical CU partitioning prediction model based on the Swin Transformer. First, a multi-branch feature fusion architecture is designed: the Swin Transformer's shifted window attention mechanism extracts global contextual features, lightweight CNNs capture local texture details, and traditional edge/variance features enhance multi-scale representation. Second, a recursive hierarchical decision mechanism dynamically activates sub-CU prediction branches based on the partitioning probability of parent nodes, ensuring strict compliance with the HEVC standard quadtree syntax. Additionally, a hybrid pooling strategy and dilated convolutions improve edge feature retention. Experiments on 3D-HEVC standard test sequences show that, compared to exhaustive traversal methods, the proposed algorithm reduces encoding time by 72.7% on average, lowers the BD-Rate by 1.16%, improves CU partitioning accuracy to 94.5%, and maintains a synthesized view PSNR of 38.68 dB (baseline: 38.72 dB). The model seamlessly integrates into the HTM encoder, offering an efficient solution for real-time 3D video applications.
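
The recursive hierarchical decision described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0.5 threshold, the 64x64 to 8x8 size range, and the `toy_split_prob` stand-in for the trained model are all assumptions made for illustration. The only idea taken from the abstract is that sub-CU branches are evaluated only when the parent node is predicted to split, so the result is always a valid quadtree.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical signature: (x, y, size) -> probability that this CU is split.
SplitProb = Callable[[int, int, int], float]


def partition(x: int, y: int, size: int, split_prob: SplitProb,
              threshold: float = 0.5, min_size: int = 8) -> Dict:
    """Build a quadtree node; children are evaluated only if the parent splits."""
    node = {"x": x, "y": y, "size": size, "children": []}
    # Stop at the smallest CU size or when the parent is predicted not to split.
    if size <= min_size or split_prob(x, y, size) < threshold:
        return node
    half = size // 2
    for dy in (0, half):
        for dx in (0, half):
            node["children"].append(
                partition(x + dx, y + dy, half, split_prob, threshold, min_size)
            )
    return node


def leaves(node: Dict) -> List[Tuple[int, int, int]]:
    """Collect the final CUs (x, y, size) at the leaves of the quadtree."""
    if not node["children"]:
        return [(node["x"], node["y"], node["size"])]
    result: List[Tuple[int, int, int]] = []
    for child in node["children"]:
        result.extend(leaves(child))
    return result


def toy_split_prob(x: int, y: int, size: int) -> float:
    """Toy stand-in for a learned predictor: only the top-left CU keeps splitting."""
    return 0.9 if (x == 0 and y == 0) else 0.1


if __name__ == "__main__":
    tree = partition(0, 0, 64, toy_split_prob)
    for cu in leaves(tree):
        print(cu)
```

Because a child is never visited unless its parent splits, the leaf CUs tile the 64x64 block exactly once, which is the structural property the abstract refers to as compliance with the quadtree syntax. In a real encoder the predictor would replace the rate-distortion search at each depth; the toy function here only exercises the recursion.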

Pages: 18
