Lightweight Multi-modal Representation Learning for RGB Salient Object Detection

Cited by: 1

Authors:

Xiao, Yun ^{[1,2,4]}

Huang, Yameng ^{[3]}

Li, Chenglong ^{[1,2,4]}

Liu, Lei ^{[3]}

Zhou, Aiwu ^{[3]}

Tang, Jin ^{[3]}

Affiliations:

[1] Informat Mat & Intelligent Sensing Lab Anhui Prov, Hefei 230601, Peoples R China

[2] Anhui Prov Key Lab Multimodal Cognit Computat, Hefei 230601, Peoples R China

[3] Anhui Univ, Sch Comp Sci & Technol, Hefei 230601, Peoples R China

[4] Anhui Univ, Sch Artificial Intelligence, Hefei 230601, Peoples R China

Source:

COGNITIVE COMPUTATION | 2023 / Vol. 15 / Issue 06

Funding:

National Natural Science Foundation of China;

Keywords:

Salient object detection; Depth estimation; Lightweight network; Multi-modal representation learning; NETWORK;

DOI:

10.1007/s12559-023-10148-1

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Salient object detection (SOD) often faces challenges such as complex backgrounds and low appearance contrast. Depth information, which reflects the geometric shape of an object's surface, can supplement visible information and is receiving increasing interest in SOD. However, depth sensors operate under limited conditions and ranges (e.g., 4-5 m at most in indoor scenes), and their imaging quality is usually low. We design a lightweight network that infers depth features at reduced computational complexity; it needs only a few parameters to effectively capture depth-specific features by fusing high-level features from the RGB modality. Because both the RGB features and the inferred depth features might contain noise, we design a fusion network, which includes a self-attention-based feature interaction module and a foreground-background enhancement module, to achieve an adaptive fusion of RGB and depth features. In addition, we introduce a multi-scale fusion module with different dilated convolutions to leverage useful local and global context clues. Experimental results on five benchmark datasets show that our approach significantly outperforms the state-of-the-art RGBD SOD methods and performs comparably to the state-of-the-art RGB SOD methods. These results on multiple RGBD and RGB SOD datasets indicate that our multi-modal representation learning method can deal with the imaging limitations of single-modality data for RGB salient object detection.
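
The abstract mentions a multi-scale fusion module built from convolutions with different dilation rates. The sketch below is only a toy illustration of that general idea, not the authors' module: it works in 1D on plain Python lists, and the function names (`effective_kernel_size`, `dilated_conv1d`, `multi_scale_fuse`) and the choice to fuse branches by simple averaging are assumptions made for illustration. It shows how a kernel of size k with dilation d covers a span of k + (k - 1)(d - 1) positions, so parallel branches with different rates see local and wider context at the same parameter cost.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation):
    """Same-length 1D dilated cross-correlation with zero padding.

    Assumes an odd kernel length so the kernel is centred on each position.
    """
    half = (effective_kernel_size(len(kernel), dilation) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            idx = i - half + j * dilation  # taps are spaced `dilation` apart
            if 0 <= idx < len(signal):     # positions outside count as zero
                acc += weight * signal[idx]
        out.append(acc)
    return out


def multi_scale_fuse(signal, kernel, dilations):
    """Hypothetical fusion: average parallel branches with different dilations."""
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [sum(values) / len(branches) for values in zip(*branches)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    for d in (1, 2, 4):
        print(d, effective_kernel_size(3, d), dilated_conv1d(impulse, [1, 1, 1], d))
    print(multi_scale_fuse(impulse, [1, 1, 1], (1, 2)))
```

In a real 2D network the same effect is usually obtained by setting the dilation parameter of a standard convolution layer, and the branches are often concatenated and mixed by a learned layer rather than averaged.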

Pages: 1868-1883

Page count: 16
