Understanding dynamic scenes based on human sequence evaluation

Cited by: 15
Authors
Gonzalez, Jordi [1]
Rowe, Daniel [2,3]
Varona, Javier [4]
Xavier Roca, F. [2,3]
Affiliations
[1] CSIC, UPC, Inst Robot & Informat Ind, Barcelona, Catalonia, Spain
[2] UAB, Comp Vision Ctr, Barcelona, Catalonia, Spain
[3] UAB, Dept Comp Sci, Barcelona, Catalonia, Spain
[4] UIB, Unitat Graf & Visio Ordinador, Palma de Mallorca, Spain
Keywords
Image Sequence Evaluation; High-level processing of monitored scenes; Segmentation and tracking in complex scenes; Event recognition in dynamic scenes; Human motion understanding; Human behaviour interpretation; Natural-language text generation; Realistic demonstrators; REAL-TIME TRACKING; HUMAN MOVEMENT; RECOGNITION; MODELS
DOI
10.1016/j.imavis.2008.02.004
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, a Cognitive Vision System (CVS) is presented that explains human behaviour in monitored scenes using natural-language texts. This cognitive analysis of human movements recorded in image sequences is referred to here as Human Sequence Evaluation (HSE), which defines a set of transformation modules involved in the automatic generation of semantic descriptions from pixel values. In essence, the trajectories of human agents are obtained to generate textual interpretations of their motion and to infer the conceptual relationships of each agent with respect to its environment. For this purpose, a human behaviour model based on Situation Graph Trees (SGTs) is considered, which permits both bottom-up (hypothesis generation) and top-down (hypothesis refinement) analysis of dynamic scenes. The resulting system prototype interprets different kinds of behaviour and reports textual descriptions in multiple languages. © 2008 Elsevier B.V. All rights reserved.
Pages: 1433-1444
Page count: 12