Learning Adaptive Fusion Bank for Multi-Modal Salient Object Detection

Cited by: 9
Authors
Wang, Kunpeng [1 ,2 ]
Tu, Zhengzheng [1 ,2 ]
Li, Chenglong [3 ,4 ]
Zhang, Cheng [1 ,2 ]
Luo, Bin [1 ,2 ]
Affiliations
[1] Anhui Univ, Informat Mat & Intelligent Sensing Lab Anhui Prov, Sch Comp Sci & Technol, Hefei 230601, Peoples R China
[2] Anhui Univ, Sch Comp Sci & Technol, Anhui Prov Key Lab Multimodal Cognit Computat, Hefei 230601, Peoples R China
[3] Anhui Univ, Sch Artificial Intelligence, Anhui Prov Key Lab Multimodal Cognit Computat, Hefei 230601, Peoples R China
[4] Anhui Univ, Inst Phys Sci & Informat Technol, Hefei 230601, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Feature extraction; Lighting; Object detection; Clutter; Circuits and systems; Semantics; Data mining; Salient object detection (SOD); adaptive fusion bank; indirect interactive guidance; NETWORK; REFINEMENT;
DOI
10.1109/TCSVT.2024.3375505
Chinese Library Classification (CLC)
TM [Electrical Engineering]; TN [Electronic Technology, Communication Technology];
Subject Classification Codes
0808; 0809;
Abstract
Multi-modal salient object detection (MSOD) aims to boost saliency detection performance by integrating visible sources with depth or thermal infrared ones. Existing methods generally design different fusion schemes to handle particular issues or challenges. Although these fusion schemes are effective at addressing specific challenges, they may struggle to handle multiple complex challenges simultaneously. To solve this problem, we propose a novel adaptive fusion bank that makes full use of the complementary benefits of a set of basic fusion schemes to handle different challenges simultaneously for robust MSOD. We focus on five major challenges in MSOD: center bias, scale variation, image clutter, low illumination, and thermal crossover or depth ambiguity. The proposed fusion bank consists of five representative fusion schemes, each specifically designed for the characteristics of one challenge. The bank is scalable, and more fusion schemes can be incorporated to cover additional challenges. To adaptively select the appropriate fusion scheme for a multi-modal input, we introduce an adaptive ensemble module that forms the adaptive fusion bank, which is embedded into hierarchical layers for sufficient fusion of different source data. Moreover, we design an indirect interactive guidance module that accurately detects salient hollow objects via the skip integration of high-level semantic information and low-level spatial details. Extensive experiments on three RGBT datasets and seven RGBD datasets demonstrate that the proposed method achieves outstanding performance compared with state-of-the-art methods.
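The core idea in the abstract — a bank of basic fusion schemes whose outputs are combined by adaptively predicted weights — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the three fusion schemes shown and the simple statistic used as gate logits are assumptions; in the actual network the bank holds five challenge-specific schemes and the gating is a learned module.

```python
import numpy as np

def fusion_bank(rgb, aux):
    """Outputs of several basic fusion schemes (illustrative choices)."""
    return [
        rgb + aux,             # additive fusion
        rgb * aux,             # multiplicative fusion
        np.maximum(rgb, aux),  # max fusion
    ]

def adaptive_ensemble(rgb, aux):
    """Softmax-gated combination of the bank's outputs.

    The gate here is a crude global statistic standing in for a learned
    module conditioned on both modalities.
    """
    candidates = fusion_bank(rgb, aux)
    logits = np.array([c.mean() for c in candidates])  # stand-in gate logits
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()                           # softmax over schemes
    fused = sum(w * c for w, c in zip(weights, candidates))
    return fused, weights

# Toy feature maps standing in for visible and thermal/depth features.
rgb = np.random.rand(4, 4)
aux = np.random.rand(4, 4)
fused, weights = adaptive_ensemble(rgb, aux)
```

Embedding such an ensemble at several hierarchical layers, as the abstract describes, would amount to applying `adaptive_ensemble` to the feature maps of each encoder stage with its own gate.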
Pages: 7344-7358
Page count: 15
Related Papers
50 in total
  • [31] Multi-Modal ISAR Object Recognition using Adaptive Deep Relation Learning
    Xue, Bin
    Tong, Ningning
    2019 INTERNATIONAL CONFERENCE ON WIRELESS COMMUNICATIONS, SIGNAL PROCESSING AND NETWORKING (WISPNET 2019): ADVANCING WIRELESS AND MOBILE COMMUNICATIONS TECHNOLOGIES FOR 2020 INFORMATION SOCIETY, 2019, : 48 - 53
  • [32] Multi-modal Queried Object Detection in the Wild
    Xu, Yifan
    Zhang, Mengdan
    Fu, Chaoyou
    Chen, Peixian
    Yang, Xiaoshan
    Li, Ke
    Xu, Changsheng
    ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS 36 (NEURIPS 2023), 2023,
  • [33] Citrus Huanglongbing Detection Based on Multi-Modal Feature Fusion Learning
    Yang, Dongzi
    Wang, Fengcheng
    Hu, Yuqi
    Lan, Yubin
    Deng, Xiaoling
    FRONTIERS IN PLANT SCIENCE, 2021, 12
  • [34] Balanced Multi-modal Learning with Hierarchical Fusion for Fake News Detection
    Wu, Fei
    Chen, Shu
    Gao, Guangwei
    Ji, Yimu
    Jing, Xiao-Yuan
    PATTERN RECOGNITION, 2025, 164
  • [35] Deep multi-scale and multi-modal fusion for 3D object detection
    Guo, Rui
    Li, Deng
    Han, Yahong
    PATTERN RECOGNITION LETTERS, 2021, 151 : 236 - 242
  • [36] On Multi-modal Fusion Learning in constraint propagation
    Li, Yaoyi
    Lu, Hongtao
    INFORMATION SCIENCES, 2018, 462 : 204 - 217
  • [37] Homogeneous Multi-modal Feature Fusion and Interaction for 3D Object Detection
    Li, Xin
    Shi, Botian
    Hou, Yuenan
    Wu, Xingjiao
    Ma, Tianlong
    Li, Yikang
    He, Liang
    COMPUTER VISION, ECCV 2022, PT XXXVIII, 2022, 13698 : 691 - 707
  • [38] Multi-modal deep feature learning for RGB-D object detection
    Xu, Xiangyang
    Li, Yuncheng
    Wu, Gangshan
    Luo, Jiebo
    PATTERN RECOGNITION, 2017, 72 : 300 - 313
  • [39] Multi-modal object detection using unsupervised transfer learning and adaptation techniques
    Abbott, Rachael
    Robertson, Neil
    del Rincon, Jesus Martinez
    Connor, Barry
    ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING IN DEFENSE APPLICATIONS, 2019, 11169
  • [40] Deep learning based object detection from multi-modal sensors: an overview
    Liu, Ye
    Meng, Shiyang
    Wang, Hongzhang
    Liu, Jun
    Multimedia Tools and Applications, 2024, 83 : 19841 - 19870