What stands out in a scene? A study of human explicit saliency judgment

Cited: 115
Authors
Borji, Ali [1 ]
Sihite, Dicky N. [1 ]
Itti, Laurent [1 ,2 ,3 ]
Affiliations
[1] Univ So Calif, Dept Comp Sci, Los Angeles, CA 90089 USA
[2] Univ So Calif, Neurosci Grad Program, Los Angeles, CA 90089 USA
[3] Univ So Calif, Dept Psychol, Los Angeles, CA 90089 USA
Funding
US National Science Foundation;
Keywords
Explicit saliency judgment; Space-based attention; Eye movements; Bottom-up saliency; Free viewing; Object-based attention; Selective visual attention; Search; Model; Guidance; Mechanisms; Allocation; Locations; Features; Objects;
DOI
10.1016/j.visres.2013.07.016
Chinese Library Classification
Q189 [Neuroscience];
Subject Classification Code
071006;
Abstract
Eye tracking has become the de facto standard measure of visual attention in tasks that range from free viewing to complex daily activities. In particular, saliency models are often evaluated by their ability to predict human gaze patterns. However, fixations are influenced not only by bottom-up saliency (computed by the models) but also by many top-down factors. Thus, comparing bottom-up saliency maps to eye fixations is challenging and requires minimizing top-down influences, for example by focusing on early fixations on a stimulus. Here we propose two complementary procedures to evaluate visual saliency. We ask whether humans have explicit and conscious access to the saliency computations believed to contribute to guiding attention and eye movements. In the first experiment, 70 observers were asked to choose which object stands out the most based on its low-level features in 100 images, each containing only two objects. Using several state-of-the-art bottom-up visual saliency models that measure local and global spatial image outliers, we show that maximum saliency inside the selected object is significantly higher than inside the non-selected object and the background. Spatial outliers are thus a predictor of human judgments, and the predictor's performance is boosted by including object size as an additional feature. In the second experiment, observers were asked to draw a polygon circumscribing the most salient object in cluttered scenes. For each of 120 images, we show that a map built from the annotations of 70 observers explains the eye fixations of another 20 observers freely viewing the images, significantly above chance (dataset by Bruce and Tsotsos (2009); shuffled AUC score 0.62 ± 0.07, chance 0.50, t-test p < 0.05). We conclude that fixations agree with saliency judgments, and that classic bottom-up saliency models explain both. We further find that computational models specifically designed for fixation prediction slightly outperform models designed for salient object detection on both types of data (i.e., fixations and objects). Published by Elsevier Ltd.
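The shuffled AUC score quoted above can be illustrated with a minimal sketch of the standard procedure: saliency values at an image's own fixations serve as positives, and values at fixation locations pooled from other images serve as negatives, which discounts center bias. The function and argument names below are illustrative, not taken from the authors' code.

```python
# Minimal sketch of the shuffled-AUC (sAUC) metric for scoring a saliency
# map against human fixations. Names are hypothetical, not the paper's code.
import numpy as np
from sklearn.metrics import roc_auc_score

def shuffled_auc(saliency_map, fixations, other_fixations):
    """saliency_map: 2-D float array for one image.
    fixations: (N, 2) int array of (row, col) fixations on this image.
    other_fixations: (M, 2) int array of fixations pooled from OTHER images;
    drawing negatives from real fixations elsewhere penalizes maps that
    merely reproduce the dataset-wide, center-biased fixation distribution."""
    pos = saliency_map[fixations[:, 0], fixations[:, 1]]
    neg = saliency_map[other_fixations[:, 0], other_fixations[:, 1]]
    labels = np.concatenate([np.ones(len(pos)), np.zeros(len(neg))])
    scores = np.concatenate([pos, neg])
    # AUC is invariant to monotonic rescaling of the map, so no
    # normalization is needed; 0.5 = chance, 1.0 = perfect prediction.
    return roc_auc_score(labels, scores)
```

Averaging this score over images, with negatives resampled from the other images' fixations, yields a per-model sAUC comparable to the 0.62 ± 0.07 reported above.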
Pages: 62-77
Number of pages: 16
Related papers
87 items in total
  • [61] Parkhurst, D., Law, K., & Niebur, E. (2002). Modeling the role of salience in the allocation of overt visual attention. Vision Research, 42(1), 107-123.
  • [62] Peters, R. J., Iyer, A., Itti, L., & Koch, C. (2005). Components of bottom-up gaze allocation in natural images. Vision Research, 45(18), 2397-2416.
  • [63] Pomplun, M. (2006). Saccadic selectivity in complex visual search displays. Vision Research, 46(12), 1886-1900.
  • [64] Posner, M. I. (1980). Orienting of attention. Quarterly Journal of Experimental Psychology, 32(1), 3-25.
  • [65] Powers, A. S., Basso, M. A., & Evinger, C. (2013). Blinks slow memory-guided saccades. Journal of Neurophysiology, 109(3), 734-741.
  • [66] Rajashekar, U., Bovik, A. C., & Cormack, L. K. (2006). Visual search in noise: Revealing the influence of structural cues by gaze-contingent classification image analysis. Journal of Vision, 6(4), 379-386.
  • [67] Rayner, K. (1979). Eye guidance in reading: Fixation locations within words. Perception, 8(1), 21-30.
  • [68] Reinagel, P., & Zador, A. M. (1999). Natural scene statistics at the centre of gaze. Network: Computation in Neural Systems, 10(4), 341-350.
  • [69] Russell, B. C., Torralba, A., Murphy, K. P., & Freeman, W. T. (2008). LabelMe: A database and web-based tool for image annotation. International Journal of Computer Vision, 77(1-3), 157-173.
  • [70] Seo, H. J., & Milanfar, P. (2009). Static and space-time visual saliency detection by self-resemblance. Journal of Vision, 9(12).