STA-Net: spatial-temporal attention network for video salient object detection

Cited by: 29
Authors
Bi, Hong-Bo [1]
Lu, Di [1]
Zhu, Hui-Hui [1]
Yang, Li-Na [1]
Guan, Hua-Ping [2]
Affiliations
[1] NorthEast Petr Univ, Daqing, Peoples R China
[2] Fujian Normal Univ, Fuzhou, Peoples R China
Keywords
Multi-scale; Video salient object detection; Attention; Pyramid; SEGMENTATION; OPTIMIZATION;
DOI
10.1007/s10489-020-01961-4
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This paper presents a systematic study of the role of spatial and temporal attention mechanisms in the video salient object detection (VSOD) task. We present a two-stage spatial-temporal attention network, named STA-Net, which makes two major contributions. In the first stage, we devise a Multi-Scale-Spatial-Attention (MSSA) module to reduce the computational cost on non-salient regions while exploiting multi-scale saliency information. This sliced attention method offers a distinct way to efficiently exploit the high-level features of the network with an enlarged receptive field. In the second stage, we propose a Pyramid-Saliency-Shift-Aware (PSSA) module, which emphasizes dynamic object information, since it offers a valid shift cue for confirming the salient object and capturing temporal information. This temporal detection module encourages precise salient region detection. Extensive experiments show that the proposed STA-Net is effective for the video salient object detection task and achieves compelling performance in comparison with state-of-the-art methods.
Pages: 3450-3459
Page count: 10