Data-Level Recombination and Lightweight Fusion Scheme for RGB-D Salient Object Detection

Cited by: 72
Authors
Wang, Xuehao [1, 2]
Li, Shuai [1, 2]
Chen, Chenglizhao [1, 2, 3]
Fang, Yuming [4]
Hao, Aimin [1, 2]
Qin, Hong [5]
Affiliations
[1] Beihang Univ, Qingdao Res Inst, Beijing 100083, Peoples R China
[2] Beihang Univ, State Key Lab VRTS, Beijing 100191, Peoples R China
[3] Qingdao Univ, Coll Comp Sci & Technol, Qingdao 266071, Peoples R China
[4] Jiangxi Univ Finance & Econ, Informat Coll, Nanchang 330013, Jiangxi, Peoples R China
[5] SUNY Stony Brook, Comp Sci Dept, Stony Brook, NY 11794 USA
Funding
National Natural Science Foundation of China; U.S. National Science Foundation;
Keywords
Object detection; Feature extraction; Semantics; Deep learning; Estimation; Training; Network architecture; RGB-D saliency detection; data-level fusion; lightweight designed triple-stream network; IMAGE; NETWORK; MODEL;
DOI
10.1109/TIP.2020.3037470
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Existing RGB-D salient object detection methods treat depth information as an independent component that complements RGB, and they widely follow a bistream parallel network architecture. To selectively fuse the CNN features extracted from RGB and depth into a final result, state-of-the-art (SOTA) bistream networks usually consist of two independent subbranches: one for RGB saliency and the other for depth saliency. However, depth saliency is persistently inferior to RGB saliency because the RGB component is intrinsically more informative than the depth component. As a result, the bistream architecture easily biases its subsequent fusion procedure toward the RGB subbranch, leading to a performance bottleneck. In this paper, we propose a novel data-level recombination strategy that fuses RGB with D (depth) before deep feature extraction, where we cyclically convert the original 4-dimensional RGB-D data into DGB, RDB, and RGD. Then, a newly designed lightweight triple-stream network is applied to these recombined data to achieve an optimal channel-wise complementary fusion status between RGB and D, achieving new SOTA performance.
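The recombination step described in the abstract can be illustrated with a minimal sketch. It assumes that the names DGB, RDB, and RGD mean the depth channel takes the place of R, G, and B respectively, and that depth has already been scaled to the same value range as the color channels; neither detail is confirmed by this record. The function name `recombine_rgbd` and the nested-list image layout are hypothetical and chosen only to keep the example free of dependencies.

```python
def recombine_rgbd(rgb, depth):
    """Build three 3-channel images from one RGB image and one depth map.

    rgb   : list of rows, each row a list of (r, g, b) tuples
    depth : list of rows, each row a list of depth values
    Assumption: depth is already scaled to the same range as the color channels.
    """
    # The two inputs must describe the same pixel grid.
    if len(rgb) != len(depth) or any(
        len(rgb_row) != len(d_row) for rgb_row, d_row in zip(rgb, depth)
    ):
        raise ValueError("rgb and depth must have the same height and width")

    dgb, rdb, rgd = [], [], []
    for rgb_row, d_row in zip(rgb, depth):
        # Assumed reading of the names: D replaces one color channel at a time.
        dgb.append([(d, g, b) for (r, g, b), d in zip(rgb_row, d_row)])
        rdb.append([(r, d, b) for (r, g, b), d in zip(rgb_row, d_row)])
        rgd.append([(r, g, d) for (r, g, b), d in zip(rgb_row, d_row)])
    return {"DGB": dgb, "RDB": rdb, "RGD": rgd}


# A 1x2 image: two pixels with their depth values.
example_rgb = [[(10, 20, 30), (40, 50, 60)]]
example_depth = [[7, 9]]
recombined = recombine_rgbd(example_rgb, example_depth)
```

Under this reading, each of the three outputs keeps two color channels alongside depth, so every stream of a triple-stream network would receive a mix of color and depth at its input rather than one modality alone.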
Pages: 458-471
Number of pages: 14