HierNet: Hierarchical Transformer U-Shape Network for RGB-D Salient Object Detection

Cited by: 1
Authors
Lv, Pengfei [1 ]
Yu, Xiaosheng [1 ]
Wang, Junxiang [1 ]
Wu, Chengdong [1 ]
Affiliations
[1] Northeastern Univ, Fac Robot Sci & Engn, Shenyang, Peoples R China
Source
2023 35th Chinese Control and Decision Conference (CCDC), 2023
Funding
National Natural Science Foundation of China
Keywords
salient object detection; RGB-D; transformer; self-attention;
DOI
10.1109/CCDC58219.2023.10327419
CLC Number
TP [Automation Technology, Computer Technology]
Discipline Code
0812
Abstract
With the popularity of depth sensors, research on RGB-D salient object detection (SOD) is thriving. However, owing to limitations of the external environment and of the sensor itself, depth information is often unreliable. To meet this challenge, existing models often purify the depth information with complex convolution and pooling operations. This discards a large amount of useful information along with the noise, and reduces the opportunities for multi-modality interaction between RGB and depth. Moreover, as information is gradually lost, the hidden relationships among multi-level features are ignored. To tackle these problems, we propose a Hierarchical Transformer U-Shape Network (HierNet) with three components: 1) a depth calibration module with a simple structure provides faithful depth information with minimal information loss, creating the conditions for cross-modality, cross-layer interaction; 2) a set of global-view transformer encoders with multi-head attention discovers the latent coherence between the RGB and depth modalities; with weight sharing, several such encoder sets form the hierarchical transformer embedding module, which searches for long-range dependencies across levels; 3) exploiting the complementary features of U-shape networks, we adopt a dual-stream U-shape network as our backbone. Extensive fair experiments on four challenging datasets demonstrate the outstanding performance of the proposed model compared with state-of-the-art models.
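The cross-modal coupling described in component 2) can be illustrated with a minimal multi-head cross-attention sketch, in which RGB tokens query depth tokens. This is an assumed NumPy illustration, not the paper's implementation: the learned query/key/value projections of a real transformer encoder are omitted for brevity, and all names and shapes here are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_cross_attention(rgb, depth, num_heads):
    """Illustrative cross-modal attention: RGB tokens (queries) attend
    to depth tokens (keys/values), split across num_heads subspaces.
    rgb:   (n_tokens, dim) RGB features
    depth: (m_tokens, dim) depth features
    Note: learned W_q/W_k/W_v projections are omitted for simplicity.
    """
    n, d = rgb.shape
    assert d % num_heads == 0, "feature dim must divide evenly across heads"
    dh = d // num_heads
    out = np.empty_like(rgb)
    for h in range(num_heads):
        s = slice(h * dh, (h + 1) * dh)
        q, k, v = rgb[:, s], depth[:, s], depth[:, s]
        # (n, m) attention map: how strongly each RGB token looks at each depth token
        attn = softmax(q @ k.T / np.sqrt(dh))
        out[:, s] = attn @ v  # convex combination of depth values
    return out
```

Because each output row is a convex combination of depth tokens, the result stays within the range of the depth features, which is one way to see how attention mixes rather than amplifies the auxiliary modality.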
Pages: 1807-1811 (5 pages)