An Object-Based Visual Attention Model for Robotic Applications

Cited by: 52
Authors
Yu, Yuanlong [1 ]
Mann, George K. I. [1 ]
Gosine, Raymond G. [1 ]
Affiliations
[1] Mem Univ Newfoundland, Fac Engn & Appl Sci, St John's, NF A1B 3X5, Canada
Source
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS | 2010, Vol. 40, No. 5
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Integrated competition (IC) hypothesis; mobile robotics; object-based visual attention; top-down biasing; PREATTENTIVE VISION; ORIENTATION; CONTOUR; SEARCH; CORTEX; SPACE;
DOI
10.1109/TSMCB.2009.2038895
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
By extending the integrated competition (IC) hypothesis, this paper presents an object-based visual attention model that selects one object of interest using low-dimensional features, so that visual perception starts with a fast attentional selection procedure. The proposed model comprises seven modules: learning of object representations stored in a long-term memory (LTM), preattentive processing, top-down biasing, bottom-up competition, mediation between the top-down and bottom-up pathways, generation of saliency maps, and perceptual completion processing. It operates in two phases: a learning phase and an attending phase. In the learning phase, the representation of an object is trained statistically while that object is attended. A dual-coding object representation consisting of local and global codings is proposed: intensity, color, and orientation features build the local coding, and a contour feature constitutes the global coding. In the attending phase, the model first preattentively segments the visual field into discrete proto-objects using Gestalt rules. If a task-specific object is given, the model recalls the corresponding representation from LTM and deduces the task-relevant feature(s) to evaluate top-down biases. Mediation between automatic bottom-up competition and conscious top-down biasing then yields a location-based saliency map. By combining the location-based saliency within each proto-object, a proto-object-based saliency is evaluated. The most salient proto-object is selected for attention and is finally passed to the perceptual completion processing module to yield a complete object region. The model has been applied to distinct robotic tasks: detection of task-specific stationary and moving objects. Experimental results under different conditions validate the model.
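The final selection step described in the abstract — combining location-based saliency within each proto-object and attending the most salient one — can be sketched as follows. This is a minimal illustration only: the paper's exact combination rule is not stated in the abstract, so the mean saliency over each proto-object's pixels is assumed here, and the function and variable names are hypothetical.

```python
import numpy as np

def select_salient_proto_object(saliency_map, proto_object_masks):
    """Combine location-based saliency within each proto-object and
    return (index of the most salient proto-object, per-object scores).

    Assumption: per-object saliency is the mean of the location-based
    saliency over the pixels in that proto-object's boolean mask.
    """
    scores = [float(saliency_map[mask].mean()) for mask in proto_object_masks]
    return int(np.argmax(scores)), scores

# Toy example: a 4x4 location-based saliency map and two proto-objects.
saliency = np.array([
    [0.1, 0.1, 0.8, 0.9],
    [0.1, 0.2, 0.9, 0.8],
    [0.1, 0.1, 0.1, 0.1],
    [0.2, 0.1, 0.1, 0.1],
])
mask_a = np.zeros((4, 4), dtype=bool)
mask_a[0:2, 0:2] = True   # dim proto-object (mean saliency 0.125)
mask_b = np.zeros((4, 4), dtype=bool)
mask_b[0:2, 2:4] = True   # bright proto-object (mean saliency 0.85)

winner, scores = select_salient_proto_object(saliency, [mask_a, mask_b])
# winner is 1: the brighter proto-object wins the competition.
```

In the full model this selection is preceded by the mediation of top-down biases and bottom-up competition that produces `saliency_map`, and followed by perceptual completion on the winning region.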
Pages: 1398-1412 (15 pages)