Multi-scale Spatial-Temporal Feature Aggregating for Video Salient Object Segmentation

Citations: 0
Authors
Mu, Changhong [1 ]
Yuan, Zebin [1 ]
Ouyang, Xiuqin [1 ]
Wang, Bo [1 ]
Affiliations
[1] Soochow Univ, Golden Mantis Inst Architecture, Suzhou, Peoples R China
Source
2019 IEEE 4TH INTERNATIONAL CONFERENCE ON SIGNAL AND IMAGE PROCESSING (ICSIP 2019) | 2019
Keywords
video saliency; salient object segmentation; spatial temporal information; deep learning;
DOI
10.1109/siprocess.2019.8868427
Chinese Library Classification
TP31 [Computer Software];
Discipline Classification Code
081202 ; 0835 ;
Abstract
This paper proposes an algorithm based on supervised deep convolutional neural networks (CNNs) that fully extracts and fuses the spatial-temporal information of video frames to improve video saliency detection. Conventional video saliency detection methods suffer from several problems (e.g., accumulation of irrelevant information, separate acquisition of spatial and temporal cues, and high computational cost): they neither integrate spatial and temporal information fully nor meet real-time requirements. A Multi-scale Spatial Feature Extraction Module (MSFEM) based on deep learning is first designed to extract spatial features at multiple scales simultaneously. We then fuse the spatial-temporal features of the frames with a Multi-scale Spatial-Temporal Feature Refine Module (MSTFRM), which takes full advantage of spatial-temporal information to produce high-quality detections with strong spatial-temporal saliency consistency. The network is trained and tested end to end, avoiding the unnecessary time overhead of preprocessing. To validate the method, we conduct comprehensive quantitative comparisons against 8 state-of-the-art techniques; all results demonstrate the advantages of our method in accuracy and reliability.
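The abstract does not specify the exact layer configuration of MSFEM. As an illustration only, one common way to extract spatial features at multiple scales simultaneously is to run the same input through parallel dilated convolutions and stack the responses. The NumPy sketch below shows this idea for a single-channel frame; the function names, the 3x3 kernels, and the dilation rates (1, 2, 4) are assumptions for illustration, not the paper's actual design:

```python
import numpy as np

def dilated_conv2d(x, kernel, dilation):
    """'Same' 2-D convolution of a single-channel map x (H, W) with a
    3x3 kernel whose taps are sampled at the given dilation rate
    (zero padding). Illustrative only, not the paper's layer."""
    H, W = x.shape
    pad = dilation                      # keeps the output size equal to the input
    xp = np.pad(x, pad)
    out = np.zeros((H, W), dtype=float)
    for i in range(3):                  # accumulate the 9 dilated taps
        for j in range(3):
            out += kernel[i, j] * xp[i * dilation:i * dilation + H,
                                     j * dilation:j * dilation + W]
    return out

def multi_scale_features(x, kernels, dilations=(1, 2, 4)):
    """Hypothetical MSFEM-style block: apply parallel dilated convolutions
    to the same frame and stack the responses, one feature map per scale."""
    return np.stack([dilated_conv2d(x, k, d)
                     for k, d in zip(kernels, dilations)])
```

Because dilation enlarges the receptive field without adding parameters, a stack like this captures coarse and fine spatial context in a single forward pass, which matches the stated goal of extracting spatial features at multiple scales simultaneously.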
Pages: 224-229
Page count: 6