Visual Saliency Based on Scale-Space Analysis in the Frequency Domain

Cited by: 455
Authors
Li, Jian [1]
Levine, Martin D. [2, 3]
An, Xiangjing [1]
Xu, Xin [1]
He, Hangen [1]
Affiliations
[1] Natl Univ Def Technol, Inst Automat, Changsha 410073, Hunan, Peoples R China
[2] McGill Univ, Dept Elect & Comp Engn, Montreal, PQ H3A 2A7, Canada
[3] McGill Univ, CIM, Montreal, PQ H3A 2A7, Canada
Funding
National Natural Science Foundation of China
Keywords
Visual attention; saliency; hypercomplex Fourier transform; eye tracking; scale space analysis; ATTENTION; IMAGE; MODEL; SEARCH;
DOI
10.1109/TPAMI.2012.147
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We address the issue of visual saliency from three perspectives. First, we consider saliency detection as a frequency domain analysis problem. Second, we achieve this by employing the concept of nonsaliency. Third, we simultaneously consider the detection of salient regions of different size. The paper proposes a new bottom-up paradigm for detecting visual saliency, characterized by a scale-space analysis of the amplitude spectrum of natural images. We show that the convolution of the image amplitude spectrum with a low-pass Gaussian kernel of an appropriate scale is equivalent to an image saliency detector. The saliency map is obtained by reconstructing the 2D signal using the original phase and the amplitude spectrum, filtered at a scale selected by minimizing saliency map entropy. A Hypercomplex Fourier Transform performs the analysis in the frequency domain. Using available databases, we demonstrate experimentally that the proposed model can predict human fixation data. We also introduce a new image database and use it to show that the saliency detector can highlight both small and large salient regions, as well as inhibit repeated distractors in cluttered images. In addition, we show that it is able to predict salient regions on which people focus their attention.
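The pipeline described in the abstract (smooth the amplitude spectrum with a Gaussian kernel, reconstruct with the original phase, keep the scale whose saliency map has the lowest entropy) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses an ordinary 2D FFT on a grayscale image in place of the Hypercomplex Fourier Transform, a plain histogram entropy as the selection criterion, and circular Gaussian smoothing. The function names, the candidate `scales`, the `post_sigma` smoothing of the map, and the bin count are all hypothetical choices made for the example.

```python
import numpy as np


def gaussian_kernel_fft(shape, sigma):
    """Frequency response of a circular Gaussian kernel centred at index (0, 0)."""
    rows, cols = shape
    y = np.fft.fftfreq(rows) * rows  # signed offsets: 0, 1, ..., -2, -1
    x = np.fft.fftfreq(cols) * cols
    yy, xx = np.meshgrid(y, x, indexing="ij")
    kernel = np.exp(-(yy ** 2 + xx ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    return np.fft.fft2(kernel)


def smooth(array, sigma):
    """Circular Gaussian smoothing of a real 2D array via the FFT."""
    response = gaussian_kernel_fft(array.shape, sigma)
    return np.real(np.fft.ifft2(np.fft.fft2(array) * response))


def entropy(saliency, bins=64):
    """Shannon entropy (bits) of the histogram of map values (assumed criterion)."""
    hist, _ = np.histogram(saliency, bins=bins)
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())


def saliency_map(image, scales=(0.5, 1, 2, 4, 8), post_sigma=2.0):
    """Return (saliency map, selected scale) for a 2D grayscale image."""
    spectrum = np.fft.fft2(image)
    amplitude = np.abs(spectrum)
    phase = np.angle(spectrum)
    best = None
    for sigma in scales:
        # Low-pass filter the amplitude spectrum at this scale.
        smoothed_amplitude = smooth(amplitude, sigma)
        # Reconstruct the 2D signal with the ORIGINAL phase.
        recon = np.fft.ifft2(smoothed_amplitude * np.exp(1j * phase))
        # Squared magnitude, lightly smoothed, then normalised by its maximum.
        sal = smooth(np.abs(recon) ** 2, post_sigma)
        sal = sal / (sal.max() + 1e-12)
        h = entropy(sal)
        if best is None or h < best[0]:
            best = (h, sigma, sal)
    return best[2], best[1]
```

A call such as `sal, sigma = saliency_map(gray_image)` returns a map with the same shape as the input together with the scale that was selected. For color input, the hypercomplex formulation named in the abstract would replace the per-channel or grayscale FFT used here.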
Pages: 996-1010
Page count: 15