MFCINet: multi-level feature and context information fusion network for RGB-D salient object detection

Cited by: 0
Authors
Chenxing Xia
Difeng Chen
Xiuju Gao
Bin Ge
Kuan-Ching Li
Xianjin Fang
Yan Zhang
Ke Yang
Affiliations
[1] Anhui University of Science and Technology, College of Computer Science and Engineering
[2] Hefei Comprehensive National Science Center, Institute of Energy
[3] Anhui Purvar Bigdata Technology Co. Ltd, College of Electrical and Information Engineering
[4] Anhui University of Science and Technology, Department of Computer Science and Information Engineering
[5] Providence University, The School of Electronics and Information Engineering
[6] Anhui University
Source
The Journal of Supercomputing | 2024, Vol. 80
Keywords
Context semantic information; Cross-level features; Multi-level fusion; Salient object detection
DOI
Not available
Abstract
Recently, RGB-D salient object detection (SOD) has attracted widespread research interest. Existing methods tend to treat features at different levels equally, leading to inadequate interaction among cross-level features. Furthermore, many methods rely on stacking convolution layers or using dilated convolutions to enlarge the receptive field and extract high-level semantic features; however, these approaches may not effectively capture context information, resulting in the loss of semantic cues. In this paper, we propose a novel multi-level feature and context information fusion network (MFCINet) for RGB-D SOD, which mainly comprises a detail enhancement fusion module (DEFM), a semantic enhancement fusion module (SEFM), and a multi-scale receptive field enhancement module (MREM). Concretely, the DEFM and SEFM introduce a combination of dual attention mechanisms to better fuse the rich details in low-level features and the rich semantic information in high-level features, respectively. Subsequently, the MREM captures rich contextual semantic information through the parallel operation of convolution kernels and skip connections; its outputs are fed into a subsequent densely connected pyramid decoder for SOD. Experimental results on five common datasets show that our model outperforms 17 state-of-the-art (SOTA) methods.
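The abstract does not specify the exact structure of the MREM, so the snippet below is only a minimal PyTorch sketch of the general idea it describes: parallel convolution branches with different kernel sizes whose outputs are fused and combined with a skip connection. The class name MultiScaleReceptiveFieldBlock, the kernel sizes, and the fusion strategy are illustrative assumptions, not the paper's design.

```python
import torch
import torch.nn as nn


class MultiScaleReceptiveFieldBlock(nn.Module):
    """Illustrative multi-scale receptive-field block (assumed design):
    parallel convolution branches with different kernel sizes, fused by a
    1x1 convolution and combined with a skip connection."""

    def __init__(self, channels: int, kernel_sizes=(1, 3, 5, 7)):
        super().__init__()
        # One convolution branch per kernel size; larger kernels see a wider context.
        self.branches = nn.ModuleList(
            nn.Sequential(
                nn.Conv2d(channels, channels, k, padding=k // 2, bias=False),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            )
            for k in kernel_sizes
        )
        # 1x1 convolution fuses the concatenated branch outputs back to `channels`.
        self.fuse = nn.Conv2d(channels * len(kernel_sizes), channels, 1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        multi_scale = torch.cat([branch(x) for branch in self.branches], dim=1)
        # Skip connection keeps the original features alongside the enlarged receptive field.
        return self.fuse(multi_scale) + x


if __name__ == "__main__":
    feats = torch.randn(1, 64, 32, 32)        # dummy backbone feature map
    block = MultiScaleReceptiveFieldBlock(64)
    print(block(feats).shape)                  # torch.Size([1, 64, 32, 32])
```

In this sketch the skip connection preserves the input features while the parallel branches enlarge the effective receptive field without the heavy stacking of convolution layers that the abstract argues against.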
Pages: 2487-2513
Page count: 26