Exploring Multimodal Visual Features for Continuous Affect Recognition

Cited by: 23
Authors
Sun, Bo [1]
Cao, Siming [1]
Li, Liandong [1]
He, Jun [1]
Yu, Lejun [1]
Affiliations
[1] Beijing Normal Univ, Coll Informat Sci & Technol, Beijing 100875, Peoples R China
Source
PROCEEDINGS OF THE 6TH INTERNATIONAL WORKSHOP ON AUDIO/VISUAL EMOTION CHALLENGE (AVEC'16) | 2016
Keywords
Continuous Emotion Recognition; CNN; Multimodal Features; SVR; Residual Network;
DOI
10.1145/2988257.2988270
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
This paper presents our work in the Emotion Sub-Challenge of the 6th Audio/Visual Emotion Challenge and Workshop (AVEC 2016), whose goal is to explore the use of audio, visual and physiological signals to continuously predict the values of the emotion dimensions (arousal and valence). As visual features are very important in emotion recognition, we try a variety of handcrafted and deep visual features. For each video clip, besides the baseline features, we extract multi-scale Dense SIFT features (MSDF) and several types of convolutional neural network (CNN) features to recognize the expression phases of the current frame. We train a linear Support Vector Regression (SVR) model for each kind of feature on the RECOLA dataset. Multimodal fusion of these modalities is then performed with a multiple linear regression model. The final Concordance Correlation Coefficient (CCC) values we obtained on the development set are 0.824 for arousal and 0.718 for valence; on the test set they are 0.683 for arousal and 0.642 for valence.
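The CCC scores quoted in the abstract measure agreement between a predicted trace and the gold-standard annotation. The sketch below assumes the standard definition, CCC = 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), computed with population statistics. The function name `concordance_cc` is hypothetical, and this is an illustration of the metric, not the authors' evaluation code.

```python
def concordance_cc(pred, gold):
    """Concordance Correlation Coefficient of two equal-length sequences.

    Assumes the standard definition with population (1/n) statistics.
    """
    n = len(pred)
    if n != len(gold) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")

    mean_p = sum(pred) / n
    mean_g = sum(gold) / n

    # Population variances and covariance (divide by n, not n - 1).
    var_p = sum((p - mean_p) ** 2 for p in pred) / n
    var_g = sum((g - mean_g) ** 2 for g in gold) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, gold)) / n

    denom = var_p + var_g + (mean_p - mean_g) ** 2
    if denom == 0:
        raise ValueError("CCC is undefined for two identical constant sequences")
    return 2 * cov / denom


if __name__ == "__main__":
    # Toy values for illustration only; not data from the paper.
    gold = [1.0, 2.0, 3.0]
    print(concordance_cc(gold, gold))             # perfect agreement
    print(concordance_cc([2.0, 3.0, 4.0], gold))  # same shape, constant offset
```

Unlike Pearson correlation, the squared mean difference in the denominator penalizes a constant offset, so a prediction that tracks the shape of the annotation but is biased still scores below 1.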
Pages: 83-88
Number of pages: 6