Learning Adaptive Fusion Bank for Multi-Modal Salient Object Detection

Cited by: 9
Authors
Wang, Kunpeng [1 ,2 ]
Tu, Zhengzheng [1 ,2 ]
Li, Chenglong [3 ,4 ]
Zhang, Cheng [1 ,2 ]
Luo, Bin [1 ,2 ]
Affiliations
[1] Anhui Univ, Informat Mat & Intelligent Sensing Lab Anhui Prov, Sch Comp Sci & Technol, Hefei 230601, Peoples R China
[2] Anhui Univ, Sch Comp Sci & Technol, Anhui Prov Key Lab Multimodal Cognit Computat, Hefei 230601, Peoples R China
[3] Anhui Univ, Sch Artificial Intelligence, Anhui Prov Key Lab Multimodal Cognit Computat, Hefei 230601, Peoples R China
[4] Anhui Univ, Inst Phys Sci & Informat Technol, Hefei 230601, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Feature extraction; Lighting; Object detection; Clutter; Circuits and systems; Semantics; Data mining; Salient object detection (SOD); adaptive fusion bank; indirect interactive guidance; NETWORK; REFINEMENT;
DOI
10.1109/TCSVT.2024.3375505
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Discipline Codes
0808; 0809
Abstract
Multi-modal salient object detection (MSOD) aims to boost saliency detection performance by integrating visible sources with depth or thermal infrared ones. Existing methods generally design different fusion schemes to handle certain issues or challenges. Although these fusion schemes are effective at addressing specific issues, they may struggle to handle multiple complex challenges simultaneously. To solve this problem, we propose a novel adaptive fusion bank that makes full use of the complementary benefits of a set of basic fusion schemes to handle different challenges simultaneously for robust MSOD. We focus on handling five major challenges in MSOD, namely center bias, scale variation, image clutter, low illumination, and thermal crossover or depth ambiguity. The proposed fusion bank consists of five representative fusion schemes, each specifically designed according to the characteristics of the corresponding challenge. The bank is scalable, and additional fusion schemes can be incorporated to address further challenges. To adaptively select the appropriate fusion scheme for each multi-modal input, we introduce an adaptive ensemble module that forms the adaptive fusion bank, which is embedded into hierarchical layers for sufficient fusion of different source data. Moreover, we design an indirect interactive guidance module to accurately detect salient hollow objects via the skip integration of high-level semantic information and low-level spatial details. Extensive experiments on three RGBT datasets and seven RGBD datasets demonstrate that the proposed method achieves outstanding performance compared with state-of-the-art methods.
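The abstract describes an adaptive ensemble that selects among several fusion schemes per multi-modal input. Below is a minimal, hypothetical PyTorch sketch of that general idea: a bank of placeholder fusion branches combined by input-dependent gating weights. All class names, the branch design, and the gating head here are illustrative assumptions, not the authors' actual implementation or code.

```python
# Minimal sketch of an "adaptive fusion bank"-style module (assumed design).
import torch
import torch.nn as nn


class SimpleFusionScheme(nn.Module):
    """Placeholder fusion branch: concatenate visible and auxiliary
    (depth/thermal) features, then project back to the input channels."""
    def __init__(self, channels: int):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, rgb: torch.Tensor, aux: torch.Tensor) -> torch.Tensor:
        return self.fuse(torch.cat([rgb, aux], dim=1))


class AdaptiveFusionBank(nn.Module):
    """Bank of fusion branches mixed by softmax gating weights that are
    predicted from the global context of both modalities."""
    def __init__(self, channels: int, num_schemes: int = 5):
        super().__init__()
        self.schemes = nn.ModuleList(
            [SimpleFusionScheme(channels) for _ in range(num_schemes)]
        )
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                      # global context
            nn.Conv2d(2 * channels, num_schemes, kernel_size=1),
        )

    def forward(self, rgb: torch.Tensor, aux: torch.Tensor) -> torch.Tensor:
        # Per-sample weights over the schemes: (B, num_schemes, 1, 1).
        weights = torch.softmax(self.gate(torch.cat([rgb, aux], dim=1)), dim=1)
        # Run every branch, then take the gated weighted sum.
        fused = torch.stack([s(rgb, aux) for s in self.schemes], dim=1)
        return (weights.unsqueeze(2) * fused).sum(dim=1)


if __name__ == "__main__":
    rgb = torch.randn(2, 64, 40, 40)   # visible features at one level
    aux = torch.randn(2, 64, 40, 40)   # depth or thermal features
    out = AdaptiveFusionBank(channels=64)(rgb, aux)
    print(out.shape)                   # torch.Size([2, 64, 40, 40])
```

In the paper such a bank is reportedly embedded at several hierarchical feature levels; the sketch shows only a single level and omits the challenge-specific branch designs and the indirect interactive guidance module.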
Pages: 7344 - 7358
Number of pages: 15