Online learning of task-driven object-based visual attention control

Cited by: 47

Authors:

Borji, Ali [1,2]

Ahmadabadi, Majid Nil [1,3]

Araabi, Babak Nadjar [1,3]

Hamidi, Mandana [4]

Affiliations:

[1] Inst Res Fundamental Sci, Sch Cognit Sci, Tehran, Iran

[2] Univ Bonn, Dept Comp Sci 3, D-5300 Bonn, Germany

[3] Univ Tehran, Dept Elect & Comp Engn, Control & Intelligent Proc Ctr Excellence, Tehran, Iran

[4] IIT, I-16163 Genoa, Italy

Source:

IMAGE AND VISION COMPUTING | 2010, Vol. 28, No. 7

Keywords:

Task-driven attention; Object-based attention; Top-down attention; Saliency-based model; Reinforcement learning; State space discretization; RECOGNITION; SCENE; MODELS; TIME;

DOI:

10.1016/j.imavis.2009.10.006

Chinese Library Classification (CLC):

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We propose a biologically motivated computational model for learning task-driven and object-based visual attention control in interactive environments. In this model, top-down attention is learned interactively and is used to search for a desired object in the scene by biasing the bottom-up attention, in order to form a need-based and object-driven state representation of the environment. Our model consists of three layers. First, in the early visual processing layer, the most salient location of a scene is derived using the biased saliency-based bottom-up model of visual attention. Then a cognitive component in the higher visual processing layer performs an application-specific operation, such as object recognition, at the focus of attention. From this information, a state is derived in the decision-making and learning layer. Top-down attention is learned by the U-TREE algorithm, which successively grows an object-based binary tree. Internal nodes in this tree check for the existence of a specific object in the scene by biasing the early vision and object recognition parts. Its leaves point to states in the action value table, and motor actions are associated with the leaves. After performing a motor action, the agent receives a reinforcement signal from the critic. This signal is alternately used for modifying the tree or updating the action selection policy. The proposed model is evaluated on visual navigation tasks, where the obtained results support the applicability and usefulness of the developed method for robotics. (C) 2009 Elsevier B.V. All rights reserved.
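
The tree-to-table mapping described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `Node`, `Leaf`, `find_state`, and `q_update` are hypothetical, the one-step Q-learning update is an assumed stand-in for the unspecified action-value rule, and the tree is fixed by hand rather than grown by U-TREE.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class Leaf:
    state: int  # row index into the action-value table (assumed layout)


@dataclass
class Node:
    obj: str                       # object whose presence this node checks
    present: Union["Node", Leaf]   # branch taken if the object is found
    absent: Union["Node", Leaf]    # branch taken if it is not found


def find_state(tree: Union[Node, Leaf], object_in_scene: Callable[[str], bool]) -> int:
    """Descend from the root; each internal node asks whether its object is visible."""
    node = tree
    while isinstance(node, Node):
        node = node.present if object_in_scene(node.obj) else node.absent
    return node.state


def q_update(q: List[List[float]], s: int, a: int, r: float, s_next: int,
             alpha: float = 0.1, gamma: float = 0.9) -> None:
    """One-step Q-learning update (an assumption; the abstract names no specific rule)."""
    q[s][a] += alpha * (r + gamma * max(q[s_next]) - q[s][a])


# Hand-built tree with three leaves, i.e. three states; object names are made up.
tree = Node("sign", Leaf(0), Node("door", Leaf(1), Leaf(2)))
scene = {"door"}                      # stand-in for the object recognizer's output
q = [[0.0, 0.0] for _ in range(3)]    # 3 states x 2 motor actions

s = find_state(tree, lambda name: name in scene)
q_update(q, s, a=0, r=1.0, s_next=2)
```

In this sketch the recognizer is replaced by a set-membership test. In the model described above, each internal-node check would instead bias early vision and object recognition toward the queried object.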


Pages: 1130-1145

Page count: 16

References: 52 in total

