Top-down control of visual attention in object detection.

Cited by: 264

Authors:

Oliva, A [1]

Torralba, A [1]

Castelhano, MS [1]

Henderson, JM [1]

Affiliations:

[1] Michigan State Univ, E Lansing, MI 48824 USA

Source:

2003 INTERNATIONAL CONFERENCE ON IMAGE PROCESSING, VOL 1, PROCEEDINGS | 2003

Keywords:

DOI:

10.1109/icip.2003.1246946

Chinese Library Classification (CLC) number:

TB8 [Photographic technology];

Discipline classification code:

0804;

Abstract:

Current computational models of visual attention focus on bottom-up information and ignore scene context. However, studies in visual cognition show that humans use context to facilitate object detection in natural scenes by directing their attention or eyes to diagnostic regions. Here we propose a model of attention guidance based on global scene configuration. We show that the statistics of low-level features across the scene image determine where a specific object (e.g. a person) should be located. Human eye movements show that regions chosen by the top-down model agree with regions scrutinized by human observers performing a visual search task for people. The results validate the proposition that top-down information from visual context modulates the saliency of image regions during the task of object detection. Contextual information provides a shortcut for efficient object detection systems.


Pages: 253-256

Page count: 4

References (13 total):

[1] CASTELHANO MS, 2003, 3 ANN M VIS SCI SOC

[2] The effects of semantic consistency on eye movements during complex scene viewing [J].