BMFNet: Bifurcated multi-modal fusion network for RGB-D salient object detection

Cited by: 2
Authors
Sun, Chenwang [1]
Zhang, Qing [1]
Zhuang, Chenyu [1]
Zhang, Mingqian [2]
Affiliations
[1] Shanghai Inst Technol, Sch Comp Sci & Informat Engn, Shanghai 201418, Peoples R China
[2] Shanghai Inst Technol, Sch Mech Engn, Shanghai 201418, Peoples R China
Funding
Natural Science Foundation of Shanghai
Keywords
RGB-D salient object detection; Cross-modal fusion; Multi-modal integration; Multi-level aggregation; IMAGE;
DOI
10.1016/j.imavis.2024.105048
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Although deep learning-based RGB-D salient object detection methods have achieved impressive results in recent years, some issues remain to be addressed, including multi-modal fusion and multi-level aggregation. In this paper, we propose a bifurcated multi-modal fusion network (BMFNet) to address these two issues jointly. First, we design a multi-modal feature interaction (MFI) module that captures the complementary information between the RGB and depth features by leveraging channel attention and spatial attention. Second, unlike the widely used layer-by-layer progressive fusion, we adopt a bifurcated fusion strategy for all the multi-level unimodal and cross-modal features to reduce the gaps between features at different levels. For intra-group feature aggregation, a multi-modal feature fusion (MFF) module integrates the intra-group multi-modal features to produce a low-level or high-level saliency feature. For inter-group aggregation, a multi-scale feature learning (MFL) module exploits the contextual interactions between different scales to boost fusion performance. Experimental results on five public RGB-D datasets demonstrate the effectiveness and superiority of the proposed network. The code and prediction maps will be available at https://github.com/ZhangQing0329/BMFNet
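
The abstract names channel attention and spatial attention as the ingredients of the MFI module but gives no formulas. The following is only a rough illustration of how such cross-modal attention can work in general, not the authors' implementation: the function names, the choice of which modality guides which, and the residual form `x * (1 + attention)` are all assumptions, and the learned convolution and MLP layers a real module would contain are omitted.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(feat):
    """Return one weight in (0, 1) per channel from global average pooling.

    feat is a nested list shaped [C][H][W]. A real module would pass the
    pooled vector through learned layers; that step is omitted here.
    """
    weights = []
    for channel in feat:
        total = sum(sum(row) for row in channel)
        count = len(channel) * len(channel[0])
        weights.append(sigmoid(total / count))
    return weights


def spatial_attention(feat):
    """Return an [H][W] map in (0, 1) from the per-pixel mean over channels."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    return [
        [sigmoid(sum(feat[k][i][j] for k in range(c)) / c) for j in range(w)]
        for i in range(h)
    ]


def cross_modal_interaction(rgb, depth):
    """Hypothetical interaction: each modality is reweighted by the other.

    Assumption: depth supplies channel weights for RGB, and RGB supplies a
    spatial map for depth. The residual form x * (1 + a) keeps the input.
    """
    ca = channel_attention(depth)
    sa = spatial_attention(rgb)
    c, h, w = len(rgb), len(rgb[0]), len(rgb[0][0])
    rgb_out = [
        [[rgb[k][i][j] * (1.0 + ca[k]) for j in range(w)] for i in range(h)]
        for k in range(c)
    ]
    depth_out = [
        [[depth[k][i][j] * (1.0 + sa[i][j]) for j in range(w)] for i in range(h)]
        for k in range(c)
    ]
    return rgb_out, depth_out


# Toy features with 2 channels and 2 x 2 spatial size.
rgb = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 0.0]]]
depth = [[[0.0, 0.0], [0.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]]]
rgb_out, depth_out = cross_modal_interaction(rgb, depth)
```

In this toy setup each output keeps the shape of its input, and positions where the guiding modality responds strongly are amplified more than the rest.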
Pages: 15
Related papers
50 entries in total
  • [21] MULTI-MODALITY DIVERSITY FUSION NETWORK WITH SWINTRANSFORMER FOR RGB-D SALIENT OBJECT DETECTION
    Duan, Songsong
    Xia, Chenxing
    Gao, Xiuju
    Ge, Bin
    Zhang, Hanling
    Li, Kuan-Ching
    2022 IEEE INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, ICIP, 2022, : 1076 - 1080
  • [22] Multi-modality information refinement fusion network for RGB-D salient object detection
    Bao, Hua
    Fan, Bo
    VISUAL COMPUTER, 2024, 40 (06): : 4183 - 4199
  • [23] M3Net: Multi-scale Multi-path Multi-modal Fusion Network and Example Application to RGB-D Salient Object Detection
    Chen, Hao
    Li, You-Fu
    Su, Dan
    2017 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2017, : 4911 - 4916
  • [24] Multi-level cross-modal interaction network for RGB-D salient object detection
    Huang, Zhou
    Chen, Huai-Xin
    Zhou, Tao
    Yang, Yun-Zhi
    Liu, Bi-Yuan
    NEUROCOMPUTING, 2021, 452 : 200 - 211
  • [25] Heterogeneous Fusion and Integrity Learning Network for RGB-D Salient Object Detection
    Gao, Haorao
    Su, Yiming
    Wang, Fasheng
    Li, Haojie
    ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS, 2024, 20 (07)
  • [26] A cross-modal adaptive gated fusion generative adversarial network for RGB-D salient object detection
    Liu, Zhengyi
    Zhang, Wei
    Zhao, Peng
    NEUROCOMPUTING, 2020, 387 : 210 - 220
  • [27] Cross-modal hierarchical interaction network for RGB-D salient object detection
    Bi, Hongbo
    Wu, Ranwan
    Liu, Ziqi
    Zhu, Huihui
    Zhang, Cong
    Xiang, Tian-Zhu
    PATTERN RECOGNITION, 2023, 136
  • [28] Modal-Adaptive Gated Recoding Network for RGB-D Salient Object Detection
    Zhu, Jinchao
    Zhang, Xiaoyu
    Fang, Xian
    Dong, Feng
    Qiu, Yu
    IEEE SIGNAL PROCESSING LETTERS, 2022, 29 : 359 - 363
  • [29] CMA-SOD: cross-modal attention fusion network for RGB-D salient object detection
    Wang, Kexuan
    Liu, Chenhua
    Zhang, Rongfu
    VISUAL COMPUTER, 2024
  • [30] MMNet: Multi-Stage and Multi-Scale Fusion Network for RGB-D Salient Object Detection
    Liao, Guibiao
    Gao, Wei
    Jiang, Qiuping
    Wang, Ronggang
    Li, Ge
    MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA, 2020, : 2436 - 2444