Emotion Recognition in Videos via Fusing Multimodal Features

Cited by: 3
Authors
Chen, Shizhe [1]
Dian, Yujie [1]
Li, Xinrui [1]
Lin, Xiaozhu [1]
Jin, Qin [1]
Liu, Haibo [2]
Lu, Li [2]
Affiliations
[1] Renmin Univ China, Multimedia Comp Lab, Sch Informat, Beijing, Peoples R China
[2] Tencent Inc, Beijing, Peoples R China
Source
PATTERN RECOGNITION (CCPR 2016), PT II | 2016 / Vol. 663
Keywords
Emotion recognition; Multimodal features fusion; CNN Features;
DOI
10.1007/978-981-10-3005-5_52
Chinese Library Classification
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Emotion recognition is a challenging task with a wide range of applications. In this paper, we present our system for the CCPR 2016 multimodal emotion recognition challenge. We extract multimodal features from acoustic signals, facial expressions, and speech content to recognize the emotion of the character in a video. Among these, the facial CNN feature is the most discriminative for emotion recognition. We train SVM and random forest classifiers on each feature type and use early and late fusion to combine the features from different modalities. To deal with the data imbalance issue, we propose adapting the probability threshold for each emotion class. Our best multimodal fusion system achieves a macro precision of 50.34% on the test set, significantly outperforming the baseline of 30.63%.
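The fusion and thresholding steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulation, so everything here is an assumption made for illustration: early fusion is shown as feature concatenation, late fusion as a weighted average of per-modality class probabilities, and threshold adaptation as choosing the class whose probability is largest relative to its own threshold. All function names and parameters are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def early_fusion(features: Sequence[Sequence[float]]) -> List[float]:
    """Early fusion (assumed form): concatenate per-modality feature vectors."""
    fused: List[float] = []
    for vec in features:
        fused.extend(vec)
    return fused


def late_fusion(probs: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Late fusion (assumed form): weighted average of class probabilities
    produced by one classifier per modality."""
    total = sum(weights)
    n_classes = len(probs[0])
    return [
        sum(w * p[c] for w, p in zip(weights, probs)) / total
        for c in range(n_classes)
    ]


def predict_with_thresholds(prob: Sequence[float],
                            thresholds: Sequence[float]) -> int:
    """Per-class thresholds (assumed scheme): score each class by
    probability / threshold, so a lower threshold favors a rarer class."""
    scores = [p / t for p, t in zip(prob, thresholds)]
    return max(range(len(scores)), key=scores.__getitem__)


def macro_precision(y_true: Sequence[int], y_pred: Sequence[int],
                    n_classes: int) -> float:
    """Unweighted mean of per-class precision; a class that is never
    predicted contributes 0 here (a convention chosen for this sketch)."""
    per_class = []
    for c in range(n_classes):
        hits = [t for t, p in zip(y_true, y_pred) if p == c]
        per_class.append(sum(1 for t in hits if t == c) / len(hits) if hits else 0.0)
    return sum(per_class) / n_classes
```

In this sketch, lowering the threshold of an under-represented class makes it easier for that class to be chosen, which is one simple way to counter class imbalance when the evaluation metric averages precision over classes.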
Pages: 632-644
Number of pages: 13