Machine Recognition of Human Activities: A Survey

Cited by: 825
Authors
Turaga, Pavan [1]
Chellappa, Rama [1]
Subrahmanian, V. S. [1]
Udrea, Octavian [1]
Affiliations
[1] Univ Maryland, Inst Adv Comp Studies, College Pk, MD 20742 USA
Keywords
Human activity analysis; image sequence analysis; machine vision; surveillance
DOI
10.1109/TCSVT.2008.2005594
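Chinese Library Classification
TM [Electrical engineering]; TN [Electronics and communication technology]
Discipline classification codes
0808; 0809
Abstract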
The past decade has witnessed a rapid proliferation of video cameras in all walks of life, which has resulted in a tremendous explosion of video content. Several applications, such as content-based video annotation and retrieval, highlight extraction, and video summarization, require recognition of the activities occurring in the video. The analysis of human activities in videos is an area with increasingly important consequences, from security and surveillance to entertainment and personal archiving. Several challenges at various levels of processing make this problem hard to solve: robustness against errors in low-level processing, view- and rate-invariant representations at midlevel processing, and semantic representation of human activities at higher-level processing. In this review paper, we present a comprehensive survey of efforts in the past couple of decades to address the problems of representation, recognition, and learning of human activities from video and related applications. We discuss the problem at two major levels of complexity: 1) "actions" and 2) "activities." "Actions" are characterized by simple motion patterns typically executed by a single human. "Activities" are more complex and involve coordinated actions among a small number of humans. We discuss several approaches and classify them according to their ability to handle varying degrees of complexity as interpreted above. We begin with a discussion of approaches to model the simplest of action classes, known as atomic or primitive actions, that do not require sophisticated dynamical modeling. Then, methods to model actions with more complex dynamics are discussed. The discussion then leads naturally to methods for higher-level representation of complex activities.
Pages: 1473-1488
Page count: 16