MSEDNet: Multi-scale fusion and edge-supervised network for RGB-T salient object detection

Cited by: 13

Authors:

Peng, Daogang [1]

Zhou, Weiyi [1]

Pan, Junzhen [1]

Wang, Danhao [1]

Affiliations:

[1] Shanghai Univ Elect Power, Coll Automat Engn, 2588 Changyang Rd, Shanghai 200090, Peoples R China

Source:

NEURAL NETWORKS | 2024 / Vol. 171

Keywords:

RGB-T; Salient object detection; Multi-scale fusion; Edge fusion loss; SEGMENTATION;

DOI:

10.1016/j.neunet.2023.12.031

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

RGB-T salient object detection (SOD) aims to accurately segment salient regions in both visible-light images and thermal infrared images. However, most existing SOD methods neglect the critical complementarity between images of different modalities, which could be exploited to further improve detection accuracy. Therefore, this work introduces MSEDNet, an RGB-T SOD method. We use an encoder to extract multi-level features from both the visible-light and thermal infrared images, and categorize them into high, mid, and low levels. We then propose three separate feature fusion modules to comprehensively extract the complementary information between modalities during fusion. Each module is applied to a specific feature level: the Edge Dilation Sharpening module for low-level features, the Spatial and Channel-Aware module for mid-level features, and the Cross-Residual Fusion module for high-level features. Finally, we introduce an edge fusion loss function for supervised learning, which effectively extracts edge information from the different modalities and suppresses background noise. Comparative experiments demonstrate the superiority of the proposed MSEDNet over other state-of-the-art methods. The code and results are available at https://github.com/Zhou-wy/MSEDNet.
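
The level-specific fusion described in the abstract can be pictured as a simple dispatch: each pair of RGB and thermal features is routed to the module assigned to its level. The sketch below is only an illustration of that routing, not the authors' implementation. The abstract does not describe the internals of the three modules, the number of encoder stages, or how stages map to levels, so `fuse_placeholder`, `fuse_all`, and the `levels` argument are hypothetical, and an element-wise mean stands in for every fusion module.

```python
from typing import Callable, Dict, List, Tuple

Feature = List[float]  # stand-in for a feature map (assumption)


def fuse_placeholder(rgb: Feature, thermal: Feature) -> Feature:
    """Element-wise mean of two features. A stand-in, NOT a module from the paper."""
    return [(r + t) / 2.0 for r, t in zip(rgb, thermal)]


# Module names come from the abstract; every body is the same placeholder,
# because the abstract does not say how the modules work internally.
FUSION_BY_LEVEL: Dict[str, Tuple[str, Callable[[Feature, Feature], Feature]]] = {
    "low": ("Edge Dilation Sharpening", fuse_placeholder),
    "mid": ("Spatial and Channel-Aware", fuse_placeholder),
    "high": ("Cross-Residual Fusion", fuse_placeholder),
}


def fuse_all(
    rgb_feats: List[Feature],
    thermal_feats: List[Feature],
    levels: List[str],
) -> List[Tuple[str, Feature]]:
    """Route each RGB/thermal feature pair to the module assigned to its level.

    `levels` is supplied by the caller (hypothetical), since the abstract does
    not state which encoder stages count as low, mid, or high.
    """
    if not (len(rgb_feats) == len(thermal_feats) == len(levels)):
        raise ValueError("rgb_feats, thermal_feats and levels must have equal length")
    fused = []
    for rgb, thermal, level in zip(rgb_feats, thermal_feats, levels):
        module_name, fuse = FUSION_BY_LEVEL[level]
        fused.append((module_name, fuse(rgb, thermal)))
    return fused
```

In a real model each placeholder would be replaced by the corresponding module from the linked repository, and the features would be tensors rather than lists of floats.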

Citation

Pages: 410-422

Page count: 13

