Recognizing expressions from face and body gesture by temporal normalized motion and appearance features

Cited by: 48
Authors
Chen, Shizhi [1]
Tian, YingLi [1]
Liu, Qingshan [2]
Metaxas, Dimitris N. [3]
Affiliations
[1] CUNY, Dept Elect Engn, New York, NY 10021 USA
[2] Nanjing Univ Informat Sci & Technol, Sch Informat & Control, Nanjing, Peoples R China
[3] Rutgers State Univ, Dept Comp Sci, Piscataway, NJ 08855 USA
Funding
National Science Foundation (USA);
Keywords
Affect recognition; Facial feature; Body gesture; MHI-HOG; Image-HOG; FACIAL EXPRESSION; RECOGNITION; SEGMENTATION;
DOI
10.1016/j.imavis.2012.06.014
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recognizing affect from both face and body gestures has recently attracted growing attention. However, efficient and effective features for describing the dynamics of face and gestures in real-time automatic affect recognition are still lacking. In this paper, we combine local motion and appearance features in a novel framework to model the temporal dynamics of face and body gestures. The proposed framework employs MHI-HOG and Image-HOG features, through temporal normalization or bag of words, to capture motion and appearance information. MHI-HOG stands for Histogram of Oriented Gradients (HOG) computed on the Motion History Image (MHI). It captures the motion direction and speed of a region of interest as an expression evolves over time. Image-HOG captures the appearance information of the corresponding region of interest. The temporal normalization method explicitly addresses the time resolution issue in video-based affect recognition. To implicitly model the local temporal dynamics of an expression, we further propose a bag of words (BOW) based representation for both MHI-HOG and Image-HOG features. Experimental results demonstrate promising performance compared with the state of the art. Recognition accuracy improves significantly over the frame-based approach, which does not consider the underlying temporal dynamics. © 2012 Elsevier B.V. All rights reserved.
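The MHI-HOG idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the classic motion history image update (a pixel that moved is set to a duration tau, all others decay by one per frame) and a single-cell, signed-orientation gradient histogram. The function names `update_mhi` and `orientation_histogram`, the frame-differencing threshold, and the toy frames are all hypothetical choices made for illustration.

```python
import math


def update_mhi(mhi, prev_frame, frame, tau, diff_threshold):
    """Return a new motion history image after observing `frame`.

    Assumption: motion is detected by simple frame differencing.
    Moving pixels are set to tau; all others decay by 1, floored at 0.
    """
    new_mhi = []
    for r, row in enumerate(frame):
        new_row = []
        for c, value in enumerate(row):
            moved = abs(value - prev_frame[r][c]) > diff_threshold
            new_row.append(tau if moved else max(0, mhi[r][c] - 1))
        new_mhi.append(new_row)
    return new_mhi


def orientation_histogram(image, bins=8):
    """Magnitude-weighted histogram of gradient orientations over one cell.

    Assumption: signed orientations in [0, 2*pi), central differences,
    L1 normalization. A full HOG descriptor would tile many such cells.
    """
    hist = [0.0] * bins
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            gx = image[r][c + 1] - image[r][c - 1]
            gy = image[r + 1][c] - image[r - 1][c]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue
            angle = math.atan2(gy, gx) % (2 * math.pi)
            index = min(int(angle / (2 * math.pi) * bins), bins - 1)
            hist[index] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


# Toy sequence: a bright vertical bar moving one column to the right per frame.
ROWS, COLS = 5, 7
frames = [
    [[255 if c == t + 1 else 0 for c in range(COLS)] for _ in range(ROWS)]
    for t in range(4)
]

mhi = [[0] * COLS for _ in range(ROWS)]
for prev_frame, frame in zip(frames, frames[1:]):
    mhi = update_mhi(mhi, prev_frame, frame, tau=3, diff_threshold=30)

hist = orientation_histogram(mhi)
print(mhi[0])
print(hist)
```

In the resulting MHI, more recently moving pixels hold larger values, so the intensity ramp along the motion path encodes direction and speed, and the gradient histogram summarizes that ramp. Applying the same histogram to a raw grayscale frame instead of the MHI would give an appearance descriptor in the spirit of Image-HOG.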
Pages: 175-185
Page count: 11