A Transformer-Based Image-Guided Depth-Completion Model with Dual-Attention Fusion Module

Cited by: 0
Authors
Wang, Shuling [1]
Jiang, Fengze [1]
Gong, Xiaojin [1]
Affiliations
[1] Zhejiang Univ, Coll Informat Sci & Elect Engn, Hangzhou 310027, Peoples R China
Keywords
depth completion; dual-attention fusion module; multi-scale dual branch; NETWORK; PROPAGATION;
DOI
10.3390/s24196270
Chinese Library Classification
O65 [Analytical Chemistry];
Subject classification codes
070302 ; 081704 ;
Abstract
Depth information is crucial for perceiving three-dimensional scenes. However, depth maps captured directly by depth sensors are often incomplete and noisy. The objective of the depth-completion task is to generate dense and accurate depth maps from sparse depth inputs by fusing guidance information from the corresponding color images captured by camera sensors. To address these challenges, we introduce transformer models, which have shown great promise in the field of vision, into the task of image-guided depth completion. By leveraging the self-attention mechanism, we propose a novel network architecture that meets the requirements of high accuracy and high resolution in depth data. Specifically, we design a dual-branch model with a transformer-based encoder that serializes image features into tokens step by step and extracts multi-scale pyramid features suitable for pixel-wise dense prediction tasks. We also incorporate a dual-attention fusion module to enhance the fusion between the two branches. This module combines convolution-based spatial- and channel-attention mechanisms, which are adept at capturing local information, with cross-attention mechanisms, which excel at capturing long-range relationships. Our model achieves state-of-the-art performance on both the NYUv2 depth and SUN-RGBD depth datasets, and our ablation studies confirm the effectiveness of the designed modules.
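The fusion idea in the abstract (local spatial and channel attention combined with cross-attention between the two branches) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the parameter-free gates, the missing learned projections, and the additive combination are all assumptions made to keep the example short.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat):
    # feat has shape (C, H, W). Gate each channel by its global average.
    # A real module would pass the pooled vector through learned layers.
    gate = sigmoid(feat.mean(axis=(1, 2)))
    return feat * gate[:, None, None]


def spatial_attention(feat):
    # Gate each pixel using the channel-wise mean and max at that pixel.
    # A real module would apply a learned convolution to these two maps.
    gate = sigmoid(feat.mean(axis=0) + feat.max(axis=0))
    return feat * gate[None, :, :]


def cross_attention(query_feat, context_feat):
    # Flatten (C, H, W) maps into H*W tokens of dimension C, then let every
    # query token attend to every context token (long-range interaction).
    # Learned query/key/value projections are omitted (assumption).
    c, h, w = query_feat.shape
    q = query_feat.reshape(c, -1).T          # (H*W, C)
    k = v = context_feat.reshape(c, -1).T    # (H*W, C)
    weights = softmax(q @ k.T / np.sqrt(c), axis=-1)
    return (weights @ v).T.reshape(c, h, w)


def dual_attention_fusion(depth_feat, rgb_feat):
    # Hypothetical combination: a local path (channel then spatial gating
    # of the summed features) plus a global path (depth queries attending
    # to RGB tokens). The paper's actual wiring may differ.
    local = spatial_attention(channel_attention(depth_feat + rgb_feat))
    global_ctx = cross_attention(depth_feat, rgb_feat)
    return local + global_ctx
```

Calling `dual_attention_fusion` on two feature maps of shape `(C, H, W)` returns a fused map of the same shape, so a sketch like this could in principle be applied at each scale of a feature pyramid.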
Pages: 21