Video modeling and learning on Riemannian manifold for emotion recognition in the wild

Cited by: 0
Authors
Mengyi Liu
Ruiping Wang
Shaoxin Li
Zhiwu Huang
Shiguang Shan
Xilin Chen
Affiliation
[1] Chinese Academy of Sciences (CAS), Key Laboratory of Intelligent Processing, Institute of Computing Technology, CAS
Source
Journal on Multimodal User Interfaces | 2016 / Vol. 10
Keywords
Emotion recognition; Video modeling; Riemannian manifold; EmotiW challenge
DOI
Not available
Chinese Library Classification number
Subject classification code
Abstract
In this paper, we present the method behind our submission to the Emotion Recognition in the Wild challenge (EmotiW). The challenge is to automatically classify the emotions acted by human subjects in video clips recorded in real-world environments. In our method, each video clip is represented by three types of image set models (linear subspace, covariance matrix, and Gaussian distribution), each of which can be viewed as a point on a Riemannian manifold. A corresponding Riemannian kernel is then applied to each set model to measure similarity or distance. For classification, we compare three types of classifiers: kernel SVM, logistic regression, and partial least squares. Finally, an optimal fusion of the classifiers learned from different kernels and different modalities (video and audio) is performed at the decision level to further boost performance. We perform extensive evaluations on the EmotiW 2014 challenge data (including the validation set and the blind test set) and evaluate the effect of each component in our pipeline. Our method achieves the best performance reported so far. To further evaluate generalization ability, we also perform experiments on the EmotiW 2013 data and on two well-known lab-controlled databases, CK+ and MMI. The results show that the proposed framework significantly outperforms state-of-the-art methods.
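The set-modeling and kernel steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the specific kernels, so the log-Euclidean kernel for covariance matrices and the projection kernel for linear subspaces are assumed here as common choices, and the function names, the regularization constant `eps`, and the subspace `rank` are all hypothetical.

```python
import numpy as np

def covariance_model(frames, eps=1e-3):
    """Model a clip (n_frames x d per-frame features) as a d x d SPD matrix."""
    centered = frames - frames.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / max(len(frames) - 1, 1)
    # Assumed regularization: keeps the matrix strictly positive definite.
    return cov + eps * np.eye(frames.shape[1])

def spd_log(mat):
    """Matrix logarithm of a symmetric positive-definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return (eigvecs * np.log(eigvals)) @ eigvecs.T

def log_euclidean_kernel(c1, c2):
    """Assumed kernel on SPD matrices: trace(log(C1) log(C2))."""
    return float(np.trace(spd_log(c1) @ spd_log(c2)))

def subspace_model(frames, rank=3):
    """Model a clip as an orthonormal basis (d x rank) of its dominant subspace."""
    centered = frames - frames.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:rank].T

def projection_kernel(u1, u2):
    """Assumed kernel on linear subspaces: squared Frobenius norm of U1^T U2."""
    return float(np.linalg.norm(u1.T @ u2, "fro") ** 2)

# Two synthetic "clips", each with 20 frames of 5-dimensional features.
rng = np.random.default_rng(0)
clip_a = rng.normal(size=(20, 5))
clip_b = rng.normal(size=(20, 5))

cov_a, cov_b = covariance_model(clip_a), covariance_model(clip_b)
sub_a, sub_b = subspace_model(clip_a), subspace_model(clip_b)

k_cov = log_euclidean_kernel(cov_a, cov_b)
k_sub = projection_kernel(sub_a, sub_b)
```

Evaluating such a kernel over all pairs of clips gives a kernel matrix per set model, which is the kind of input a kernel classifier can consume before any decision-level fusion.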
Pages: 113-124
Number of pages: 11