Video Saliency Prediction Based on Spatial-Temporal Two-Stream Network

Cited by: 57
Authors
Zhang, Kao [1 ]
Chen, Zhenzhong [1 ]
Affiliations
[1] Wuhan Univ, Sch Remote Sensing & Informat Engn, Wuhan 430079, Hubei, Peoples R China
Funding
National Natural Science Foundation of China; National Key Research and Development Program of China;
Keywords
Feature extraction; Predictive models; Streaming media; Visualization; Spatiotemporal phenomena; Computational modeling; Video saliency; spatial-temporal features; visual attention; deep learning; SPATIOTEMPORAL SALIENCY; COMPRESSED-DOMAIN; VISUAL-ATTENTION; MODEL; GAZE;
DOI
10.1109/TCSVT.2018.2883305
CLC Classification
TM [Electrical Engineering]; TN [Electronics & Communication Technology];
Discipline Codes
0808; 0809;
Abstract
In this paper, we propose a novel two-stream neural network for video saliency prediction. Unlike traditional methods based on hand-crafted feature extraction and integration, the proposed method automatically learns saliency-related spatiotemporal features from human fixations without any pre-processing, post-processing, or manual tuning. Video frames are routed through the spatial stream network to compute a static (color) saliency map for each frame. For temporal (dynamic) saliency maps, a new two-stage temporal stream network is proposed, composed of a pre-trained 2D-CNN model (SF-Net) that extracts saliency-related features and a shallow 3D-CNN model (Te-Net) that processes them; this design reduces the amount of video gaze data required, improves training efficiency, and achieves high performance. A fusion network combines the outputs of both streams to generate the final saliency maps. In addition, a convolutional Gaussian priors (CGP) layer is proposed to learn the bias phenomenon in viewing behavior and further improve prediction performance. The proposed method is compared with state-of-the-art saliency models on two public video saliency benchmark datasets, and the results demonstrate that it achieves advanced performance on video saliency prediction.
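The abstract describes two mechanisms worth making concrete: a center-bias prior on viewing behavior (learned by the CGP layer in the paper) and late fusion of the spatial and temporal stream outputs. The sketch below is an illustrative approximation only, not the paper's implementation: it uses a single fixed 2D Gaussian as the prior (the actual CGP layer learns its priors convolutionally from fixation data) and a simple weighted-sum fusion with hypothetical weights `w_s` and `w_t` (the paper's fusion network is learned).

```python
import math

def gaussian_prior(h, w, sigma_frac=0.25):
    """Fixed center-bias prior: a 2D Gaussian centered on the frame,
    approximating the viewing-bias phenomenon the CGP layer models.
    sigma_frac sets the Gaussian width as a fraction of frame size."""
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    sy, sx = sigma_frac * h, sigma_frac * w
    return [[math.exp(-((y - cy) ** 2 / (2 * sy ** 2)
                        + (x - cx) ** 2 / (2 * sx ** 2)))
             for x in range(w)] for y in range(h)]

def fuse(spatial, temporal, prior, w_s=0.5, w_t=0.5):
    """Toy late fusion: weighted sum of the two stream outputs,
    modulated by the prior, then renormalized to peak at 1.0."""
    h, w = len(spatial), len(spatial[0])
    fused = [[(w_s * spatial[y][x] + w_t * temporal[y][x]) * prior[y][x]
              for x in range(w)] for y in range(h)]
    peak = max(max(row) for row in fused) or 1.0
    return [[v / peak for v in row] for row in fused]
```

For example, fusing a uniform static map with a uniform dynamic map simply reproduces the center-bias prior (up to normalization), which is why a learned prior alone is already a strong baseline that a saliency model must improve upon.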
Pages: 3544-3557
Page count: 14