BMFNet: Bifurcated multi-modal fusion network for RGB-D salient object detection

Cited by: 2
Authors
Sun, Chenwang [1]
Zhang, Qing [1]
Zhuang, Chenyu [1]
Zhang, Mingqian [2]
Affiliations
[1] Shanghai Inst Technol, Sch Comp Sci & Informat Engn, Shanghai 201418, Peoples R China
[2] Shanghai Inst Technol, Sch Mech Engn, Shanghai 201418, Peoples R China
Funding
Natural Science Foundation of Shanghai;
Keywords
RGB-D salient object detection; Cross-modal fusion; Multi-modal integration; Multi-level aggregation; IMAGE;
DOI
10.1016/j.imavis.2024.105048
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Although deep learning-based RGB-D salient object detection methods have achieved impressive results in recent years, some issues still need to be addressed, including multi-modal fusion and multi-level aggregation. In this paper, we propose a bifurcated multi-modal fusion network (BMFNet) to address these two issues cooperatively. First, we design a multi-modal feature interaction (MFI) module to fully capture the complementary information between the RGB and depth features by leveraging channel attention and spatial attention. Second, unlike the widely used layer-by-layer progressive fusion, we adopt a bifurcated fusion strategy for all the multi-level unimodal and cross-modal features to effectively reduce the gaps between features at different levels. For the intra-group feature aggregation, a multi-modal feature fusion (MFF) module is designed to integrate the intra-group multi-modal features to produce a low-level/high-level saliency feature. For the inter-group aggregation, a multi-scale feature learning (MFL) module is introduced to exploit the contextual interactions between different scales to boost fusion performance. Experimental results on five public RGB-D datasets demonstrate the effectiveness and superiority of our proposed network. The code and prediction maps will be available at https://github.com/ZhangQing0329/BMFNet
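The abstract says only that the MFI module combines channel attention and spatial attention to exchange information between RGB and depth features; it gives no layer-level details. The following is a minimal, parameter-free NumPy sketch of that general idea, not the authors' implementation. The function names, the residual form, and the choice to let each modality's attention reweight the other are all assumptions made for illustration.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat):
    """Per-channel weights in (0, 1) from global average pooling.

    feat has shape (C, H, W); the result has shape (C, 1, 1).
    A learned version would pass the pooled vector through an MLP first.
    """
    pooled = feat.mean(axis=(1, 2), keepdims=True)
    return sigmoid(pooled)


def spatial_attention(feat):
    """Per-pixel weights in (0, 1) from channel-wise mean and max maps.

    feat has shape (C, H, W); the result has shape (1, H, W).
    A learned version would convolve the two maps instead of summing them.
    """
    avg_map = feat.mean(axis=0, keepdims=True)
    max_map = feat.max(axis=0, keepdims=True)
    return sigmoid(avg_map + max_map)


def cross_modal_interaction(rgb, depth):
    """Hypothetical interaction: each modality is reweighted by the other's
    channel and spatial attention, with a residual connection (assumption)."""
    rgb_enh = rgb + rgb * channel_attention(depth) * spatial_attention(depth)
    depth_enh = depth + depth * channel_attention(rgb) * spatial_attention(rgb)
    return rgb_enh, depth_enh


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rgb = rng.standard_normal((8, 16, 16))    # toy RGB feature map (C, H, W)
    depth = rng.standard_normal((8, 16, 16))  # toy depth feature map (C, H, W)
    rgb_enh, depth_enh = cross_modal_interaction(rgb, depth)
    print(rgb_enh.shape, depth_enh.shape)
```

Both outputs keep the input shape, so in a sketch like this the enhanced features could be passed on to later fusion stages unchanged in size. The released repository linked above should be treated as the reference for the actual module design.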
Pages: 15
Related papers
82 in total
  • [1] Achanta R, 2009, PROC CVPR IEEE, P1597, DOI 10.1109/CVPRW.2009.5206596
  • [2] Ao Luo, 2020, Computer Vision - ECCV 2020. 16th European Conference. Proceedings. Lecture Notes in Computer Science (LNCS 12357), P346, DOI 10.1007/978-3-030-58610-2_21
  • [3] Depth-Quality-Aware Salient Object Detection
    Chen, Chenglizhao
    Wei, Jipeng
    Peng, Chong
    Qin, Hong
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 2350 - 2363
  • [4] Progressively Complementarity-aware Fusion Network for RGB-D Salient Object Detection
    Chen, Hao
    Li, Youfu
    [J]. 2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2018, : 3051 - 3060
  • [5] Chen Q, 2021, AAAI CONF ARTIF INTE, V35, P1063
  • [6] Adaptive fusion network for RGB-D salient object detection
    Chen, Tianyou
    Xiao, Jin
    Hu, Xiaoguang
    Zhang, Guofeng
    Wang, Shaojie
    [J]. NEUROCOMPUTING, 2023, 522 : 152 - 164
  • [7] DPANet: Depth Potentiality-Aware Gated Attention Network for RGB-D Salient Object Detection
    Chen, Zuyao
    Cong, Runmin
    Xu, Qianqian
    Huang, Qingming
    [J]. IEEE TRANSACTIONS ON IMAGE PROCESSING, 2021, 30 : 7012 - 7024
  • [8] Chen ZY, 2020, AAAI CONF ARTIF INTE, V34, P10599
  • [9] Global Contrast based Salient Region Detection
    Cheng, Ming-Ming
    Zhang, Guo-Xin
    Mitra, Niloy J.
    Huang, Xiaolei
    Hu, Shi-Min
    [J]. 2011 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), 2011, : 409 - 416
  • [10] Cheng Y, 2014, IEEE INT CON MULTI