Using semantic content as cues for better scanpath prediction

Cited by: 11
Authors
Cerf, Moran [1]
Frady, E. Paxon [1]
Koch, Christof [1]
Affiliations
[1] CALTECH, Pasadena, CA 91125 USA
Source
PROCEEDINGS OF THE EYE TRACKING RESEARCH AND APPLICATIONS SYMPOSIUM (ETRA 2008) | 2008
Keywords
Eye Tracking; Psychophysics; Natural Scenes;
DOI
10.1145/1344471.1344508
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Under natural viewing conditions, human observers use shifts in gaze to allocate processing resources to subsets of the visual input. There are many computational models that try to predict these shifts in eye movement and attention. Although the important role of high-level stimulus properties (e.g., semantic information) stands undisputed, most models are based solely on low-level image properties. We demonstrate that a combined model of high-level object detection and low-level saliency significantly outperforms a low-level saliency model in predicting the locations humans fixate on. The data are based on eye-movement recordings of humans observing photographs of natural scenes, which contained one of the following high-level stimuli: faces, text, scrambled text, or cell phones. We show that observers, even when not instructed to look for anything in particular, fixate on a face with a probability of over 80% within their first two fixations, on text and scrambled text with a probability of over 65.1% and 57.9%, respectively, and on cell phones with a probability of 8.3%. This suggests that content with meaningful semantic information is significantly more likely to be seen earlier. Adding regions of interest (ROI), which depict the locations of the high-level meaningful features, significantly improves the prediction of a saliency model for stimuli with high semantic importance, while it has little effect for an object with no semantic meaning.
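The abstract does not say how the ROI information and the saliency map are combined or scored, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a linear blend of a saliency map with a binary ROI mask and scores a map by the area under the ROC curve for fixated versus non-fixated pixels. The function names (`normalize`, `combine`, `fixation_auc`), the `weight` parameter, and the toy data are all hypothetical.

```python
import numpy as np


def normalize(m):
    """Rescale a map to [0, 1]; a constant map becomes all zeros."""
    m = np.asarray(m, dtype=float)
    span = m.max() - m.min()
    return np.zeros_like(m) if span == 0 else (m - m.min()) / span


def combine(saliency, roi_mask, weight=0.5):
    """Hypothetical linear blend of low-level saliency and a high-level ROI mask."""
    return (1 - weight) * normalize(saliency) + weight * normalize(roi_mask)


def fixation_auc(pred, fixations):
    """ROC AUC: chance that a fixated pixel outscores a non-fixated one (ties count half)."""
    pred = np.asarray(pred, dtype=float)
    fixated = np.zeros(pred.shape, dtype=bool)
    for row, col in fixations:
        fixated[row, col] = True
    pos = pred[fixated]
    neg = pred[~fixated]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)


# Toy 4x4 scene (made-up data): a strong low-level peak in the top-left corner,
# and an ROI (e.g. a detected face) in the bottom-right quadrant.
saliency = np.zeros((4, 4))
saliency[0, 0] = 1.0
saliency[3, 3] = 0.2
roi_mask = np.zeros((4, 4))
roi_mask[2:4, 2:4] = 1.0
fixations = [(3, 3), (2, 2)]  # both fixations land inside the ROI

auc_saliency = fixation_auc(normalize(saliency), fixations)
auc_combined = fixation_auc(combine(saliency, roi_mask), fixations)
print(f"saliency only: {auc_saliency:.3f}, saliency + ROI: {auc_combined:.3f}")
```

In this toy case the blended map ranks the fixated pixels higher than the saliency-only map does, which mirrors the kind of improvement the abstract reports for semantically meaningful stimuli. The numbers themselves carry no empirical weight.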
Pages: 143-146
Page count: 4
Related papers
12 items in total
[1]  
CERF M, 2008, ADV NEURAL INFORM PR, V20
[2]   The relation of phase noise and luminance contrast to overt attention in complex visual stimuli [J].
Einhaeuser, Wolfgang ;
Rutishauser, Ueli ;
Frady, E. Paxon ;
Nadler, Swantje ;
Koenig, Peter ;
Koch, Christof .
JOURNAL OF VISION, 2006, 6 (11) :1148-1158
[3]   At first sight: A high-level pop out effect for faces [J].
Hershler, O ;
Hochstein, S .
VISION RESEARCH, 2005, 45 (13) :1707-1724
[4]   A model of saliency-based visual attention for rapid scene analysis [J].
Itti, L ;
Koch, C ;
Niebur, E .
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 1998, 20 (11) :1254-1259
[5]   Computational modelling of visual attention [J].
Itti, L ;
Koch, C .
NATURE REVIEWS NEUROSCIENCE, 2001, 2 (03) :194-203
[6]   A saliency-based search mechanism for overt and covert shifts of visual attention [J].
Itti, L ;
Koch, C .
VISION RESEARCH, 2000, 40 (10-12) :1489-1506
[7]  
James W., 1950, Principles of Psychology, V2
[8]   NEWBORNS PREFERENTIAL TRACKING OF FACE-LIKE STIMULI AND ITS SUBSEQUENT DECLINE [J].
JOHNSON, MH ;
DZIURAWIEC, S ;
ELLIS, H ;
MORTON, J .
COGNITION, 1991, 40 (1-2) :1-19
[9]  
OLIVA A, 2003, IM PROC 2003 P 2003
[10]   Components of bottom-up gaze allocation in natural images [J].
Peters, RJ ;
Iyer, A ;
Itti, L ;
Koch, C .
VISION RESEARCH, 2005, 45 (18) :2397-2416