Multi-scale Spatial-Temporal Feature Aggregating for Video Salient Object Segmentation

Citations: 0
Authors
Mu, Changhong [1 ]
Yuan, Zebin [1 ]
Ouyang, Xiuqin [1 ]
Wang, Bo [1 ]
Affiliations
[1] Soochow Univ, Golden Mantis Inst Architecture, Suzhou, Peoples R China
Source
2019 IEEE 4TH INTERNATIONAL CONFERENCE ON SIGNAL AND IMAGE PROCESSING (ICSIP 2019) | 2019
Keywords
video saliency; salient object segmentation; spatial temporal information; deep learning;
DOI
10.1109/siprocess.2019.8868427
Chinese Library Classification
TP31 [Computer Software];
Discipline Classification Code
081202 ; 0835 ;
Abstract
This paper proposes an algorithm based on supervised deep convolutional neural networks (CNNs) that fully extracts and fuses the spatial-temporal information of video frames to improve video saliency detection. Conventional video saliency detection methods suffer from several problems (e.g., accumulation of irrelevant information, separate acquisition of spatial and temporal cues, and high computational cost): they neither integrate spatial and temporal information fully nor meet real-time requirements. A Multi-scale Spatial Feature Extraction Module (MSFEM) based on deep learning is first designed to extract spatial features at multiple scales simultaneously. We then fuse the spatial-temporal features of the frames with a Multi-scale Spatial-Temporal Feature Refine Module (MSTFRM), which takes full advantage of spatial-temporal information to produce high-quality detections with strong spatial-temporal saliency consistency. The network is trained and tested end to end, avoiding the unnecessary time overhead of preprocessing. To validate the method, we conduct comprehensive quantitative comparisons against 8 state-of-the-art techniques; all results demonstrate the advantages of our method in accuracy and reliability.
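The abstract does not specify the exact layer configuration of MSFEM. As an illustration only, one common way to extract spatial features at multiple scales simultaneously is to run the same input through parallel dilated convolutions and stack the responses. The NumPy sketch below shows this idea for a single-channel frame; the function names, the 3x3 kernels, and the dilation rates (1, 2, 4) are assumptions for illustration, not the paper's actual design:

```python
import numpy as np

def dilated_conv2d(x, kernel, dilation):
    """'Same' 2-D convolution of a single-channel map x (H, W) with a
    3x3 kernel whose taps are sampled at the given dilation rate
    (zero padding). Illustrative only, not the paper's layer."""
    H, W = x.shape
    pad = dilation                      # keeps the output size equal to the input
    xp = np.pad(x, pad)
    out = np.zeros((H, W), dtype=float)
    for i in range(3):                  # accumulate the 9 dilated taps
        for j in range(3):
            out += kernel[i, j] * xp[i * dilation:i * dilation + H,
                                     j * dilation:j * dilation + W]
    return out

def multi_scale_features(x, kernels, dilations=(1, 2, 4)):
    """Hypothetical MSFEM-style block: apply parallel dilated convolutions
    to the same frame and stack the responses, one feature map per scale."""
    return np.stack([dilated_conv2d(x, k, d)
                     for k, d in zip(kernels, dilations)])
```

Because dilation enlarges the receptive field without adding parameters, a stack like this captures coarse and fine spatial context in a single forward pass, which matches the stated goal of extracting spatial features at multiple scales simultaneously.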
Pages: 224-229
Page count: 6