RGB-D Salient Object Detection with Cross-Modality Modulation and Selection

Cited by: 95

Authors:

Li, Chongyi ^{[1]}

Cong, Runmin ^{[2]}

Piao, Yongri ^{[3]}

Xu, Qianqian ^{[4]}

Loy, Chen Change ^{[1]}

Affiliations:

[1] Nanyang Technol Univ, Singapore, Singapore

[2] Beijing Jiaotong Univ, Beijing, Peoples R China

[3] Dalian Univ Technol, Dalian, Peoples R China

[4] Chinese Acad Sci, Inst Comp Technol, Beijing, Peoples R China

Source:

COMPUTER VISION - ECCV 2020, PT VIII | 2020 / Vol. 12353

Funding:

China Postdoctoral Science Foundation;

Keywords:

NETWORK; FUSION;

DOI:

10.1007/978-3-030-58598-3_14

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present an effective method to progressively integrate and refine the cross-modality complementarities for RGB-D salient object detection (SOD). The proposed network mainly addresses two challenging issues: 1) how to effectively integrate the complementary information from an RGB image and its corresponding depth map, and 2) how to adaptively select more saliency-related features. First, we propose a cross-modality feature modulation (cmFM) module to enhance feature representations by taking the depth features as a prior, which models the complementary relations of RGB-D data. Second, we propose an adaptive feature selection (AFS) module to select saliency-related features and suppress the inferior ones. The AFS module exploits multi-modality spatial feature fusion while considering the self-modality and cross-modality interdependencies of channel features. Third, we employ a saliency-guided position-edge attention (sg-PEA) module to encourage our network to focus more on saliency-related regions. Together, these modules form the cmMS block, which facilitates the refinement of saliency features in a coarse-to-fine fashion. Coupled with a bottom-up inference, the refined saliency features enable accurate and edge-preserving SOD. Extensive experiments demonstrate that our network outperforms state-of-the-art saliency detectors on six popular RGB-D SOD benchmarks.
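The abstract does not say how the cmFM module computes its output. As a rough illustration of the general idea of using depth features as a prior, the sketch below assumes an element-wise affine modulation, in which a scale and a shift are derived from the depth feature and applied to the RGB feature. The function name `cross_modality_modulation` and the scalar parameters `w_scale`, `b_scale`, `w_shift`, `b_shift` are hypothetical stand-ins for learned layers and are not taken from the paper.

```python
from typing import List

Matrix = List[List[float]]


def cross_modality_modulation(
    rgb_feat: Matrix,
    depth_feat: Matrix,
    w_scale: float = 0.5,
    b_scale: float = 1.0,
    w_shift: float = 0.1,
    b_shift: float = 0.0,
) -> Matrix:
    """Modulate an RGB feature map with a depth feature map (illustrative only).

    Assumption: the scale (gamma) and shift (beta) are simple linear functions
    of the depth feature; a real network would learn them with conv layers.
    """
    if len(rgb_feat) != len(depth_feat):
        raise ValueError("feature maps must have the same number of rows")

    out: Matrix = []
    for rgb_row, depth_row in zip(rgb_feat, depth_feat):
        if len(rgb_row) != len(depth_row):
            raise ValueError("feature maps must have the same number of columns")
        row = []
        for r, d in zip(rgb_row, depth_row):
            gamma = w_scale * d + b_scale  # depth-dependent scale
            beta = w_shift * d + b_shift   # depth-dependent shift
            row.append(gamma * r + beta)   # element-wise affine modulation
        out.append(row)
    return out


if __name__ == "__main__":
    rgb = [[1.0, 2.0], [3.0, 4.0]]
    depth = [[0.0, 1.0], [2.0, 0.0]]
    print(cross_modality_modulation(rgb, depth))
```

With the default parameters, positions where the depth feature is zero pass the RGB feature through unchanged, while larger depth responses amplify and offset it.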


Pages: 225-241

Number of pages: 17

References: 48 in total
