Going from Image to Video Saliency: Augmenting Image Salience with Dynamic Attentional Push

Cited by: 42
Authors
Gorji, Siavash [1]
Clark, James J. [1]
Affiliation
[1] McGill Univ, Dept Elect & Comp Engn, Ctr Intelligent Machines, Montreal, PQ, Canada
Source
2018 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR) | 2018
Keywords
VISUAL-ATTENTION; DETECTION MODEL; SCENE; GAZE; EYES
DOI
10.1109/CVPR.2018.00783
CLC Classification
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
We present a novel method that incorporates recent advances in static saliency models to predict saliency in videos. Our model augments static saliency models with the Attentional Push effect of the photographer and the scene actors in a shared-attention setting. We demonstrate that not only is it imperative to use static Attentional Push cues, but that a noticeable performance improvement is achievable by learning the time-varying nature of Attentional Push. We propose a multi-stream Convolutional Long Short-Term Memory (ConvLSTM) network structure that augments state-of-the-art static saliency models with dynamic Attentional Push. Our network contains four pathways: a saliency pathway and three Attentional Push pathways. The multi-pathway structure is followed by an augmenting convnet that learns to combine the complementary, time-varying outputs of the ConvLSTMs by minimizing the relative entropy between the augmented saliency and viewers' fixation patterns on videos. We evaluate our model by comparing the performance of several augmented static saliency models with the state-of-the-art in spatiotemporal saliency on the three largest dynamic eye-tracking datasets: HOLLYWOOD2, UCF-Sport and DIEM. Experimental results illustrate that a solid performance gain is achievable with the proposed methodology.
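The architecture described in the abstract can be illustrated with a minimal sketch. The PyTorch code below is an assumption-based illustration, not the authors' implementation: class names, channel widths, kernel sizes, and the exact per-pathway inputs (a static saliency stream plus three Attentional Push cue streams) are placeholders; only the four-pathway ConvLSTM layout, the augmenting convnet, and the relative-entropy (KL-divergence) training loss follow the abstract.

# Minimal sketch (assumptions noted above), not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvLSTMCell(nn.Module):
    """Standard ConvLSTM cell: all four gates from one convolution."""
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.hid_ch = hid_ch
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

class AugmentedSaliency(nn.Module):
    """One saliency pathway + three Attentional Push pathways, fused by an
    augmenting convnet into a per-frame saliency (log-probability) map."""
    def __init__(self, in_ch=1, hid_ch=32):
        super().__init__()
        self.pathways = nn.ModuleList(ConvLSTMCell(in_ch, hid_ch) for _ in range(4))
        self.fuse = nn.Sequential(
            nn.Conv2d(4 * hid_ch, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 1, 1))

    def forward(self, streams):
        # streams: list of 4 tensors, each (B, T, C, H, W): a static saliency
        # map stream and three Attentional Push cue streams (placeholders).
        B, T, _, H, W = streams[0].shape
        states = [(streams[0].new_zeros(B, p.hid_ch, H, W),
                   streams[0].new_zeros(B, p.hid_ch, H, W)) for p in self.pathways]
        outputs = []
        for t in range(T):
            hs = []
            for k, (p, s) in enumerate(zip(self.pathways, states)):
                h, c = p(streams[k][:, t], s)
                states[k] = (h, c)
                hs.append(h)
            sal = self.fuse(torch.cat(hs, dim=1))  # (B, 1, H, W)
            outputs.append(F.log_softmax(sal.flatten(1), dim=1).view(B, 1, H, W))
        return torch.stack(outputs, dim=1)  # (B, T, 1, H, W) log-probability maps

def kl_loss(log_pred, fixation_density, eps=1e-8):
    """Relative entropy between a predicted saliency map (log-probabilities)
    and the viewers' fixation density for one frame."""
    target = fixation_density.flatten(1)
    target = target / (target.sum(dim=1, keepdim=True) + eps)
    return F.kl_div(log_pred.flatten(1), target, reduction='batchmean')

Under these assumptions, training would average the per-frame KL loss over time, e.g. loss = torch.stack([kl_loss(pred[:, t], fix[:, t]) for t in range(T)]).mean(), which mirrors the relative-entropy objective stated in the abstract.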
Pages: 7501-7511
Number of pages: 11