STI-Net: Spatiotemporal integration network for video saliency detection

Cited by: 14
Authors
Zhou, Xiaofei [1 ]
Cao, Weipeng [2 ]
Gao, Hanxiao [1 ]
Ming, Zhong [2 ]
Zhang, Jiyong [1 ]
Affiliations
[1] Hangzhou Dianzi Univ, Sch Automat, Hangzhou 310018, Peoples R China
[2] Guangdong Lab Artificial Intelligence & Digital Ec, Shenzhen 518107, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Spatiotemporal saliency; Feature aggregation; Saliency prediction; Saliency fusion; OBJECT DETECTION; FUSION; SEGMENTATION; ATTENTION; FEATURES;
DOI
10.1016/j.ins.2023.01.106
Chinese Library Classification (CLC)
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812;
Abstract
Image saliency detection has received substantial attention in recent years and has advanced significantly. In contrast, the community has paid comparatively little attention to video saliency detection. In particular, existing video saliency models are prone to failure in videos with challenging scenarios such as fast motion, dynamic backgrounds, and nonrigid deformation. Moreover, it is inappropriate to perform video saliency detection directly with image saliency models, which ignore temporal information. To address these issues, this study proposes a novel end-to-end spatiotemporal integration network (STI-Net) for detecting salient objects in videos. Specifically, our method consists of three key steps: feature aggregation, saliency prediction, and saliency fusion, which are applied sequentially to generate spatiotemporal deep feature maps, coarse saliency predictions, and the final saliency map. The key advantage of our model lies in the comprehensive exploration of spatial and temporal information across the entire network: the two kinds of features interact with each other in the feature aggregation step, are used to construct the boundary cue in the saliency prediction step, and also serve as the original information in the saliency fusion step. As a result, the generated spatiotemporal deep feature maps characterize the salient objects precisely and completely, and the coarse saliency predictions have well-defined boundaries, which effectively improves the quality of the final saliency map. Furthermore, "shortcut connections" are introduced into our model to make the proposed network easy to train and to obtain accurate results even when the network is deep. Extensive experiments on two publicly available and challenging video datasets demonstrate the effectiveness of the proposed model, which achieves performance comparable to state-of-the-art saliency models.
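The three-step pipeline described in the abstract (feature aggregation of the spatial and temporal streams, coarse saliency prediction with a boundary cue, and saliency fusion, with shortcut connections throughout) can be summarized in the sketch below. This is a minimal illustrative outline, not the authors' released implementation: the class names (FeatureAggregation, SaliencyPrediction, SaliencyFusion, STINetSketch), channel sizes, and layer choices are assumptions made for clarity.

```python
# Illustrative sketch only -- not the authors' code. Module names and layer
# choices are assumptions based on the three steps described in the abstract.
import torch
import torch.nn as nn


class FeatureAggregation(nn.Module):
    """Fuse spatial (appearance) and temporal (motion) features of one scale."""
    def __init__(self, channels: int):
        super().__init__()
        self.merge = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, spatial_feat, temporal_feat):
        fused = self.merge(torch.cat([spatial_feat, temporal_feat], dim=1))
        # "Shortcut connection": add back the spatial feature so the deep
        # network remains easy to train.
        return fused + spatial_feat


class SaliencyPrediction(nn.Module):
    """Predict a coarse saliency map and a boundary map from the fused feature."""
    def __init__(self, channels: int):
        super().__init__()
        self.saliency_head = nn.Conv2d(channels, 1, kernel_size=1)
        self.boundary_head = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, fused_feat):
        return self.saliency_head(fused_feat), self.boundary_head(fused_feat)


class SaliencyFusion(nn.Module):
    """Refine the coarse prediction using the original spatial/temporal streams."""
    def __init__(self, channels: int):
        super().__init__()
        self.refine = nn.Sequential(
            nn.Conv2d(2 * channels + 2, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=1),
        )

    def forward(self, spatial_feat, temporal_feat, coarse_sal, boundary):
        x = torch.cat([spatial_feat, temporal_feat, coarse_sal, boundary], dim=1)
        return torch.sigmoid(self.refine(x))


class STINetSketch(nn.Module):
    """Minimal three-step pipeline: aggregation -> prediction -> fusion."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.aggregate = FeatureAggregation(channels)
        self.predict = SaliencyPrediction(channels)
        self.fuse = SaliencyFusion(channels)

    def forward(self, spatial_feat, temporal_feat):
        fused = self.aggregate(spatial_feat, temporal_feat)
        coarse_sal, boundary = self.predict(fused)
        final_sal = self.fuse(spatial_feat, temporal_feat, coarse_sal, boundary)
        return final_sal, coarse_sal, boundary


if __name__ == "__main__":
    # Toy tensors standing in for backbone features of the current frame
    # (spatial stream) and of optical flow / adjacent frames (temporal stream).
    spatial = torch.randn(1, 64, 56, 56)
    temporal = torch.randn(1, 64, 56, 56)
    model = STINetSketch(channels=64)
    final_sal, coarse_sal, boundary = model(spatial, temporal)
    print(final_sal.shape)  # torch.Size([1, 1, 56, 56])
```

In this sketch, the residual addition in FeatureAggregation stands in for the "shortcut connections" mentioned in the abstract, and the fusion step receives both original streams alongside the coarse prediction and boundary map, mirroring the idea that spatial and temporal information serve as the original evidence in the final fusion.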
Pages: 134 - 147
Number of pages: 14
Related papers
50 records in total
  • [41] Spatiotemporal Saliency Estimation by Spectral Foreground Detection
    Aytekin, Caglar
    Possegger, Horst
    Mauthner, Thomas
    Kiranyaz, Serkan
    Bischof, Horst
    Gabbouj, Moncef
    IEEE TRANSACTIONS ON MULTIMEDIA, 2018, 20 (01) : 82 - 95
  • [42] MOTION-DECISION BASED SPATIOTEMPORAL SALIENCY FOR VIDEO SEQUENCES
    Zhu, Yaping
    Jacobson, Natan
    Pan, Hong
Nguyen, Truong
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 1333 - 1336
  • [43] Video Saliency Prediction Using Spatiotemporal Residual Attentive Networks
    Lai, Qiuxia
    Wang, Wenguan
    Sun, Hanqiu
    Shen, Jianbing
    IEEE TRANSACTIONS ON IMAGE PROCESSING, 2020, 29 : 1113 - 1126
  • [44] An Effective Video Saliency Detection Model Based on Human Visual Acuity and Spatiotemporal Cues in Cloud Systems
    Fang, Zhijun
    Zhang, Juan
    Wan, Wanggen
    Fang, Yuming
JOURNAL OF INTERNET TECHNOLOGY, 2014, 15 (05) : 835 - 840
  • [45] A novel visual saliency detection method for infrared video sequences
    Wang, Xin
    Zhang, Yuzhen
    Ning, Chen
    INFRARED PHYSICS & TECHNOLOGY, 2017, 87 : 91 - 103
  • [46] Automatic Foreground Seeds Discovery for Robust Video Saliency Detection
    Zhang, Lin
    Lu, Yao
    Zhou, Tianfei
    ADVANCES IN MULTIMEDIA INFORMATION PROCESSING - PCM 2017, PT II, 2018, 10736 : 89 - 97
  • [47] Video Saliency Detection Using Deep Convolutional Neural Networks
    Zhou, Xiaofei
    Liu, Zhi
    Gong, Chen
    Li, Gongyang
    Huang, Mengke
    PATTERN RECOGNITION AND COMPUTER VISION, PT II, 2018, 11257 : 308 - 319
  • [48] Structure-Aware Adaptive Diffusion for Video Saliency Detection
    Chen, Chenglizhao
    Wang, Guotao
    Peng, Chong
    IEEE ACCESS, 2019, 7 : 79770 - 79782
  • [49] Video Salient Object Detection Network with Bidirectional Memory and Spatiotemporal Constraints
    Wang, Hongyu
    Mu, Nan
    Zhang, Yu
    2021 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2021, : 2781 - 2786
  • [50] Moving Object Segmentation in Video using Spatiotemporal Saliency and Laplacian Coordinates
    Ramadan, Hiba
    Tairi, Hamid
2016 IEEE/ACS 13TH INTERNATIONAL CONFERENCE OF COMPUTER SYSTEMS AND APPLICATIONS (AICCSA), 2016