Cascaded Fusion of Dynamic, Spatial, and Textural Feature Sets for Person-Independent Facial Emotion Recognition

Cited by: 6
Authors
Kaechele, Markus [1 ]
Schwenker, Friedhelm [1 ]
Affiliations
[1] Univ Ulm, Inst Neural Informat Proc, D-89069 Ulm, Germany
Source
2014 22ND INTERNATIONAL CONFERENCE ON PATTERN RECOGNITION (ICPR) | 2014
Keywords
MODEL;
DOI
10.1109/ICPR.2014.797
CLC classification
TP18 [Theory of Artificial Intelligence]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Emotion recognition from facial expressions is a highly demanding task, especially in everyday life scenarios. Different sources of artifacts have to be considered in order to successfully extract the intended emotional nuances of the face. The exact and robust detection and orientation of faces impeded by occlusions, inhomogeneous lighting and fast movements is only one difficulty. Another one is the question of selecting suitable features for the application at hand. In the literature, a vast body of different visual features, grouped into dynamic, spatial and textural families, has been proposed. These features exhibit different advantages/disadvantages over each other due to their inherent structure, and thus capture complementary information, which is a promising vantage point for fusion architectures. To combine different feature sets and exploit their respective advantages, an adaptive multilevel fusion architecture is proposed. The cascaded approach integrates information on different levels and time scales using artificial neural networks for adaptive weighting of propagated intermediate results. The performance of the proposed architecture is analysed on the GEMEP-FERA corpus as well as on a novel dataset obtained from an unconstrained, spontaneous human-computer interaction scenario. The obtained performance is superior to single channels and basic fusion techniques.
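The general idea described in the abstract, per-channel classifiers whose intermediate outputs are integrated over time and then adaptively weighted, can be illustrated with a minimal sketch. This is not the authors' implementation: the channel names, the averaging step, and the fixed fusion weights are placeholders (the paper trains neural networks to produce the weighting).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-channel classifier outputs: three feature families
# (dynamic, spatial, textural), each emitting class probabilities over
# 5 emotion classes for 4 frames of a video sequence.
n_classes = 5
channel_scores = {
    "dynamic":  rng.dirichlet(np.ones(n_classes), size=4),
    "spatial":  rng.dirichlet(np.ones(n_classes), size=4),
    "textural": rng.dirichlet(np.ones(n_classes), size=4),
}

def fuse_level1(frame_scores):
    """First cascade level: integrate over time within one channel.
    Plain averaging stands in for the paper's temporal integration."""
    return frame_scores.mean(axis=0)

def fuse_level2(channel_vectors, weights):
    """Second cascade level: convex combination across channels.
    The paper learns these weights with a neural network; here they
    are fixed placeholders, normalized to sum to one."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    stacked = np.stack(channel_vectors)   # shape: (channels, classes)
    return w @ stacked                    # shape: (classes,)

level1 = [fuse_level1(s) for s in channel_scores.values()]
fused = fuse_level2(level1, weights=[0.5, 0.3, 0.2])
prediction = int(np.argmax(fused))
print("fused distribution:", fused.round(3), "predicted class:", prediction)
```

Because each level outputs a proper probability distribution, the cascade can be extended with further levels (e.g. across longer time scales) without changing the interface between stages.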
Pages: 4660-4665
Page count: 6
References
25 in total
[1]  
[Anonymous], 2007, P 6 ACM INT C IM VID, DOI 10.1145/1282280.1282340
[2]  
[Anonymous], J MULTIMODAL USER IN
[3]  
Banziger T., 2010, Blueprint for affective computing: A sourcebook, DOI 10.1037/a0025827
[4]
Bobick AF, Davis JW. The recognition of human movement using temporal templates. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2001, 23(3):257-267
[5]
Dalal N, Triggs B. Histograms of oriented gradients for human detection. 2005 IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION, VOL 1, PROCEEDINGS, 2005:886-893
[6]  
Glodek M, 2011, 12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, P2280
[7]  
Glodek M, 2012, INT C PATT RECOG, P1084
[8]  
Glodek M, 2011, LECT NOTES COMPUT SC, V6975, P359, DOI 10.1007/978-3-642-24571-8_47
[9]  
Hongying Meng, 2011, Proceedings 2011 IEEE International Conference on Automatic Face & Gesture Recognition (FG 2011), P854, DOI 10.1109/FG.2011.5771362
[10]  
Kachele Markus, 2014, 3rd International Conference on Pattern Recognition Applications and Methods (ICPRAM 2014). Proceedings, P671