STA-Net: spatial-temporal attention network for video salient object detection

Cited by: 29

Authors:

Bi, Hong-Bo [1]

Lu, Di [1]

Zhu, Hui-Hui [1]

Yang, Li-Na [1]

Guan, Hua-Ping [2]

Affiliations:

[1] Northeast Petroleum University, Daqing, People's Republic of China

[2] Fujian Normal University, Fuzhou, People's Republic of China

Source:

Applied Intelligence | 2021 / Volume 51 / Issue 06

Keywords:

Multi-scale; Video salient object detection; Attention; Pyramid; SEGMENTATION; OPTIMIZATION;

DOI:

10.1007/s10489-020-01961-4

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper conducts a systematic study of the role of spatial and temporal attention mechanisms in the video salient object detection (VSOD) task. We present a two-stage spatial-temporal attention network, named STA-Net, which makes two major contributions. In the first stage, we devise a Multi-Scale-Spatial-Attention (MSSA) module to reduce computational cost on non-salient regions while exploiting multi-scale saliency information. This sliced attention method offers a distinct way to efficiently exploit the high-level features of the network with an enlarged receptive field. In the second stage, we propose a Pyramid-Saliency-Shift-Aware (PSSA) module, which emphasizes dynamic object information, since it offers a valid shift cue for confirming the salient object and capturing temporal information. This temporal detection module encourages precise salient region detection. Extensive experiments show that the proposed STA-Net is effective for the video salient object detection task and achieves compelling performance in comparison with state-of-the-art methods.


Pages: 3450-3459

Page count: 10

