MMNet: Multi-Stage and Multi-Scale Fusion Network for RGB-D Salient Object Detection

Cited by: 35

Authors:

Liao, Guibiao [1]

Gao, Wei [1]

Jiang, Qiuping [2]

Wang, Ronggang [1]

Li, Ge [1]

Affiliations:

[1] Peking Univ, Beijing, Peoples R China

[2] Ningbo Univ, Ningbo, Zhejiang, Peoples R China

Source:

MM '20: PROCEEDINGS OF THE 28TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA | 2020

Keywords:

Salient object detection; RGB-D image; cross-modal guided attention; adversarial combination; IMAGE; SEGMENTATION;

DOI:

10.1145/3394171.3413523

Chinese Library Classification number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Most existing RGB-D salient object detection (SOD) methods directly extract and fuse raw features from RGB and depth backbones. Such methods are easily limited by low-quality depth maps and redundant cross-modal features. To effectively capture multi-scale cross-modal fusion features, this paper proposes a novel Multi-Stage and Multi-Scale Fusion Network (MMNet), which consists of a cross-modal multi-stage fusion module (CMFM) and a bi-directional multi-scale decoder (BMD). Similar to the mechanism of the visual color stage doctrine in the human visual system, the proposed CMFM explores the useful and important feature representations in the feature response stage, and effectively integrates them into cross-modal fusion features in the adversarial combination stage. Moreover, the proposed BMD learns the combination of cross-modal fusion features from multiple levels to capture both local and global information about salient objects and further boost the performance of the proposed method. Comprehensive experiments demonstrate that the proposed method achieves consistently superior performance over 14 other state-of-the-art methods on six popular RGB-D datasets when evaluated with 8 different metrics.

References

Pages: 2436-2444

Page count: 9

41 entries in total

[1] Achanta R, 2009, PROC CVPR IEEE, P1597, DOI 10.1109/CVPRW.2009.5206596

[2] [Anonymous], 2016, Proceedings of the IEEE conference on computer vision and pattern recognition, DOI 10.1109/CVPR.2016.257

[3] [Anonymous], 2020, P IEEE C COMP VIS PA, DOI 10.1109/TFUZZ.2019.2930492

[4] Salient Object Detection: A Benchmark [J].