MSEDNet: Multi-scale fusion and edge-supervised network for RGB-T salient object detection

Cited by: 13
Authors
Peng, Daogang [1]
Zhou, Weiyi [1]
Pan, Junzhen [1]
Wang, Danhao [1]
Affiliations
[1] Shanghai University of Electric Power, College of Automation Engineering, 2588 Changyang Rd, Shanghai 200090, China
Keywords
RGB-T; Salient object detection; Multi-scale fusion; Edge fusion loss; Segmentation
DOI
10.1016/j.neunet.2023.12.031
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
RGB-T salient object detection (SOD) aims to accurately segment salient regions in paired visible-light and thermal-infrared images. However, most existing SOD methods neglect the complementarity between the two modalities, which could further improve detection accuracy. This work therefore introduces MSEDNet, an RGB-T SOD method. An encoder extracts multi-level features from both the visible-light and thermal-infrared images, and these features are grouped into low, mid, and high levels. We then propose three separate feature fusion modules to comprehensively extract complementary cross-modal information during fusion, each applied to a specific level: the Edge Dilation Sharpening module for low-level features, the Spatial and Channel-Aware module for mid-level features, and the Cross-Residual Fusion module for high-level features. Finally, we introduce an edge fusion loss for supervised learning, which effectively extracts edge information from the different modalities and suppresses background noise. Comparative experiments demonstrate the superiority of MSEDNet over other state-of-the-art methods. The code and results are available at https://github.com/Zhou-wy/MSEDNet.
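To make the cross-modal fusion and edge-supervised objective concrete, below is a minimal PyTorch-style sketch under stated assumptions, not the authors' implementation: the three level-specific modules (EDS, SCA, CRF) are stood in for by a single concatenation-plus-1x1-convolution placeholder, and the edge targets are derived from the ground-truth mask by morphological dilation minus erosion. The names SimpleFusion, edge_map_from_mask, edge_fusion_loss, and the loss weighting are illustrative only; the official code is in the repository linked above.

# Hedged sketch (not the authors' code): two-stream features are fused by a
# placeholder module, and the network is supervised with a saliency loss plus
# an edge loss whose targets come from the ground-truth mask.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleFusion(nn.Module):
    """Placeholder for the paper's level-specific fusion modules (EDS / SCA / CRF).

    Reduced here to concatenation + 1x1 convolution; the actual modules are
    considerably more elaborate.
    """
    def __init__(self, channels: int):
        super().__init__()
        self.proj = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, f_rgb: torch.Tensor, f_t: torch.Tensor) -> torch.Tensor:
        # Fuse the RGB and thermal feature maps of one level.
        return self.proj(torch.cat([f_rgb, f_t], dim=1))


def edge_map_from_mask(mask: torch.Tensor) -> torch.Tensor:
    """Derive a thin edge map from a binary saliency mask (N, 1, H, W)
    via morphological dilation minus erosion (max-pool trick)."""
    dilated = F.max_pool2d(mask, kernel_size=3, stride=1, padding=1)
    eroded = -F.max_pool2d(-mask, kernel_size=3, stride=1, padding=1)
    return (dilated - eroded).clamp(0, 1)


def edge_fusion_loss(pred_sal: torch.Tensor,
                     pred_edge: torch.Tensor,
                     gt_mask: torch.Tensor,
                     edge_weight: float = 1.0) -> torch.Tensor:
    """BCE on the predicted saliency logits plus BCE on the predicted edge
    logits, supervised by edges derived from the ground-truth mask."""
    gt_edge = edge_map_from_mask(gt_mask)
    sal_loss = F.binary_cross_entropy_with_logits(pred_sal, gt_mask)
    edge_loss = F.binary_cross_entropy_with_logits(pred_edge, gt_edge)
    return sal_loss + edge_weight * edge_loss

The dilation-minus-erosion trick is a common way to obtain boundary targets directly from segmentation masks; whether MSEDNet derives its edge supervision this way or from a separate edge head per modality is an assumption here and should be checked against the released code.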
Pages: 410-422
Number of pages: 13