MSEDNet: Multi-scale fusion and edge-supervised network for RGB-T salient object detection

Cited by: 13
Authors
Peng, Daogang [1]
Zhou, Weiyi [1]
Pan, Junzhen [1]
Wang, Danhao [1]
Affiliations
[1] Shanghai University of Electric Power, College of Automation Engineering, 2588 Changyang Rd, Shanghai 200090, China
Keywords
RGB-T; Salient object detection; Multi-scale fusion; Edge fusion loss; Segmentation
DOI
10.1016/j.neunet.2023.12.031
Chinese Library Classification (CLC)
TP18 [Artificial intelligence theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
RGB-T salient object detection (SOD) aims to accurately segment salient regions in paired visible-light and thermal-infrared images. However, most existing SOD methods neglect the complementarity between the two modalities, which could further improve detection accuracy. This work therefore introduces MSEDNet, an RGB-T SOD method. An encoder extracts multi-level features from both the visible-light and thermal-infrared images, and these features are grouped into low, mid, and high levels. We then propose three separate feature fusion modules to comprehensively extract complementary cross-modal information during fusion, each applied to a specific level: the Edge Dilation Sharpening module for low-level features, the Spatial and Channel-Aware module for mid-level features, and the Cross-Residual Fusion module for high-level features. Finally, we introduce an edge fusion loss for supervised learning, which effectively extracts edge information from the different modalities and suppresses background noise. Comparative experiments demonstrate the superiority of MSEDNet over other state-of-the-art methods. The code and results are available at https://github.com/Zhou-wy/MSEDNet.
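To make the cross-modal fusion and edge-supervised objective concrete, below is a minimal PyTorch-style sketch under stated assumptions, not the authors' implementation: the three level-specific modules (EDS, SCA, CRF) are stood in for by a single concatenation-plus-1x1-convolution placeholder, and the edge targets are derived from the ground-truth mask by morphological dilation minus erosion. The names SimpleFusion, edge_map_from_mask, edge_fusion_loss, and the loss weighting are illustrative only; the official code is in the repository linked above.

# Hedged sketch (not the authors' code): two-stream features are fused by a
# placeholder module, and the network is supervised with a saliency loss plus
# an edge loss whose targets come from the ground-truth mask.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleFusion(nn.Module):
    """Placeholder for the paper's level-specific fusion modules (EDS / SCA / CRF).

    Reduced here to concatenation + 1x1 convolution; the actual modules are
    considerably more elaborate.
    """
    def __init__(self, channels: int):
        super().__init__()
        self.proj = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, f_rgb: torch.Tensor, f_t: torch.Tensor) -> torch.Tensor:
        # Fuse the RGB and thermal feature maps of one level.
        return self.proj(torch.cat([f_rgb, f_t], dim=1))


def edge_map_from_mask(mask: torch.Tensor) -> torch.Tensor:
    """Derive a thin edge map from a binary saliency mask (N, 1, H, W)
    via morphological dilation minus erosion (max-pool trick)."""
    dilated = F.max_pool2d(mask, kernel_size=3, stride=1, padding=1)
    eroded = -F.max_pool2d(-mask, kernel_size=3, stride=1, padding=1)
    return (dilated - eroded).clamp(0, 1)


def edge_fusion_loss(pred_sal: torch.Tensor,
                     pred_edge: torch.Tensor,
                     gt_mask: torch.Tensor,
                     edge_weight: float = 1.0) -> torch.Tensor:
    """BCE on the predicted saliency logits plus BCE on the predicted edge
    logits, supervised by edges derived from the ground-truth mask."""
    gt_edge = edge_map_from_mask(gt_mask)
    sal_loss = F.binary_cross_entropy_with_logits(pred_sal, gt_mask)
    edge_loss = F.binary_cross_entropy_with_logits(pred_edge, gt_edge)
    return sal_loss + edge_weight * edge_loss

The dilation-minus-erosion trick is a common way to obtain boundary targets directly from segmentation masks; whether MSEDNet derives its edge supervision this way or from a separate edge head per modality is an assumption here and should be checked against the released code.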
Pages: 410-422
Number of pages: 13