TFDEPTH: SELF-SUPERVISED MONOCULAR DEPTH ESTIMATION WITH MULTI-SCALE SELECTIVE TRANSFORMER FEATURE FUSION

Cited by: 0

Authors:

Hu, Hongli ^{[1]}

Miao, Jun ^{[1,2]}

Zhu, Guanghu ^{[1]}

Yan, Je ^{[2]}

Chu, Jun ^{[3]}

Affiliations:

[1] Nanchang Hangkong Univ, Sch Aeronaut Mfg Engn, Nanchang, Peoples R China

[2] Chinese Acad Sci, Key Lab Lunar & Deep Space Explorat, Beijing, Peoples R China

[3] Nanchang Hangkong Univ, Key Lab Jiangxi Prov Image Proc & Pattern Recognit, Nanchang 330063, Peoples R China

Source:

IMAGE ANALYSIS & STEREOLOGY | 2024 / Vol. 43 / No. 02

Keywords:

monocular depth estimation; multi-scale fusion; self-supervised learning; transformer

DOI:

10.5566/ias.2987

Chinese Library Classification (CLC) number:

T [Industrial Technology];

Subject classification code:

08;

Abstract:

Existing self-supervised models for monocular depth estimation suffer from discontinuous depth, blurred edges, and unclear contours, particularly for small objects. We propose a self-supervised monocular depth estimation network with multi-scale selective Transformer feature fusion. To preserve more detailed features, we construct a multi-scale encoder for feature extraction and use the Transformer self-attention mechanism to capture global contextual information, which enables better depth prediction for small objects. We also propose a multi-scale selective fusion (MSSF) module, which makes full use of multi-scale feature information in the decoder and performs selective fusion step by step. This effectively eliminates noise and retains local detail features, yielding a depth map with clear edges. Experimental evaluations on the KITTI dataset show that the proposed algorithm achieves an absolute relative error (Abs Rel) of 0.098 and an accuracy rate (delta) of 0.983. The results indicate that the proposed algorithm not only estimates depth values with high accuracy but also predicts continuous depth maps with clear edges.
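
The abstract reports Abs Rel and a threshold accuracy (delta) without defining them. The following is a minimal sketch of how these two metrics are commonly computed in depth-estimation evaluation, not code from the paper. The function names `abs_rel` and `delta_accuracy` are hypothetical, the sample depths are made up for illustration, and the default threshold of 1.25 is an assumption: the abstract does not say which threshold its delta value of 0.983 refers to.

```python
# Sketch of two widely used monocular-depth metrics.
# Assumption: these follow the common definitions; the paper's exact
# evaluation protocol (depth cap, crop, threshold) is not given in the abstract.

def abs_rel(pred, gt):
    """Mean of |pred - gt| / gt over pixels with valid ground truth (gt > 0)."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def delta_accuracy(pred, gt, threshold=1.25):
    """Fraction of valid pixels where max(pred/gt, gt/pred) < threshold."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0 and p > 0]
    hits = sum(1 for p, g in pairs if max(p / g, g / p) < threshold)
    return hits / len(pairs)


# Hypothetical depths in metres, for illustration only.
gt = [10.0, 20.0, 5.0, 0.0]    # 0.0 marks a pixel without ground truth
pred = [11.0, 18.0, 5.0, 7.0]

print(abs_rel(pred, gt))          # lower is better
print(delta_accuracy(pred, gt))   # higher is better, at most 1.0
```

Lower Abs Rel and higher delta both indicate predictions closer to the ground truth. Thresholds of 1.25, 1.25^2, and 1.25^3 are all in common use, so the value passed as `threshold` should match whatever protocol is being compared against.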

Pages: 139-149

Page count: 11
