An Object-Based Visual Attention Model for Robotic Applications

Cited by: 52
Authors
Yu, Yuanlong [1 ]
Mann, George K. I. [1 ]
Gosine, Raymond G. [1 ]
Affiliations
[1] Mem Univ Newfoundland, Fac Engn & Appl Sci, St John's, NF A1B 3X5, Canada
Source
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART B-CYBERNETICS | 2010, Vol. 40, No. 5
Funding
Natural Sciences and Engineering Research Council of Canada
Keywords
Integrated competition (IC) hypothesis; mobile robotics; object-based visual attention; top-down biasing; PREATTENTIVE VISION; ORIENTATION; CONTOUR; SEARCH; CORTEX; SPACE;
DOI
10.1109/TSMCB.2009.2038895
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Subject Classification Code
0812
Abstract
By extending the integrated competition (IC) hypothesis, this paper presents an object-based visual attention model that selects one object of interest using low-dimensional features, so that visual perception starts with a fast attentional selection procedure. The proposed model comprises seven modules: learning of object representations stored in a long-term memory (LTM), preattentive processing, top-down biasing, bottom-up competition, mediation between the top-down and bottom-up pathways, generation of saliency maps, and perceptual completion processing. It operates in two phases: a learning phase and an attending phase. In the learning phase, the representation of an object is trained statistically while that object is attended. A dual-coding object representation consisting of local and global codings is proposed: intensity, color, and orientation features build the local coding, and a contour feature constitutes the global coding. In the attending phase, the model first preattentively segments the visual field into discrete proto-objects using Gestalt rules. If a task-specific object is given, the model recalls the corresponding representation from LTM and deduces the task-relevant feature(s) to evaluate top-down biases. Mediation between automatic bottom-up competition and conscious top-down biasing then yields a location-based saliency map. By combining the location-based saliency within each proto-object, a proto-object-based saliency is evaluated. The most salient proto-object is selected for attention and is finally passed to the perceptual completion processing module to yield a complete object region. The model has been applied to distinct robotic tasks: detection of task-specific stationary and moving objects. Experimental results under different conditions validate the model.
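The final selection step described in the abstract — combining location-based saliency within each proto-object and attending the most salient one — can be sketched as follows. This is a minimal illustration only: the paper's exact combination rule is not stated in the abstract, so the mean saliency over each proto-object's pixels is assumed here, and the function and variable names are hypothetical.

```python
import numpy as np

def select_salient_proto_object(saliency_map, proto_object_masks):
    """Combine location-based saliency within each proto-object and
    return (index of the most salient proto-object, per-object scores).

    Assumption: per-object saliency is the mean of the location-based
    saliency over the pixels in that proto-object's boolean mask.
    """
    scores = [float(saliency_map[mask].mean()) for mask in proto_object_masks]
    return int(np.argmax(scores)), scores

# Toy example: a 4x4 location-based saliency map and two proto-objects.
saliency = np.array([
    [0.1, 0.1, 0.8, 0.9],
    [0.1, 0.2, 0.9, 0.8],
    [0.1, 0.1, 0.1, 0.1],
    [0.2, 0.1, 0.1, 0.1],
])
mask_a = np.zeros((4, 4), dtype=bool)
mask_a[0:2, 0:2] = True   # dim proto-object (mean saliency 0.125)
mask_b = np.zeros((4, 4), dtype=bool)
mask_b[0:2, 2:4] = True   # bright proto-object (mean saliency 0.85)

winner, scores = select_salient_proto_object(saliency, [mask_a, mask_b])
# winner is 1: the brighter proto-object wins the competition.
```

In the full model this selection is preceded by the mediation of top-down biases and bottom-up competition that produces `saliency_map`, and followed by perceptual completion on the winning region.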
Pages: 1398-1412 (15 pages)