Acoustic Emotion Recognition: A Benchmark Comparison of Performances

Cited by: 161
Authors
Schuller, Bjoern [1 ]
Vlasenko, Bogdan [2 ]
Eyben, Florian [1 ]
Rigoll, Gerhard [1 ]
Wendemuth, Andreas [2 ]
Affiliations
[1] Tech Univ Munich, Inst Human Machine Commun, D-80333 Munich, Germany
[2] Otto Guericke Univ, IESK, Cognit Syst, Magdeburg, Germany
Source
2009 IEEE WORKSHOP ON AUTOMATIC SPEECH RECOGNITION & UNDERSTANDING (ASRU 2009) | 2009
Funding
Academy of Finland
DOI
10.1109/ASRU.2009.5372886
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In light of the first challenge on emotion recognition from speech, we provide the largest benchmark comparison to date under equal conditions on nine standard corpora in the field, using the two predominant paradigms: frame-level modeling by means of Hidden Markov Models and suprasegmental modeling by systematic feature brute-forcing. The corpora investigated are the ABC, AVIC, DES, EMO-DB, eNTERFACE, SAL, SmartKom, SUSAS, and VAM databases. To provide better comparability among sets, we additionally cluster each database's emotions into binary valence and arousal discrimination tasks. In the results, large differences are found among corpora, which mostly stem from naturalistic emotions and spontaneous speech vs. more prototypical events. Further, suprasegmental modeling proves significantly beneficial on average when several classes are addressed at a time.
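As a rough illustration of the suprasegmental paradigm named in the abstract, the sketch below applies a small set of statistical functionals to frame-level contours, so that an utterance of any length maps to a fixed-length feature vector. The descriptor names, the choice of functionals, and the toy values are all hypothetical and are not taken from the paper, whose brute-forced feature set is far larger.

```python
from statistics import mean, pstdev


def functionals(contour):
    """Summarize one variable-length frame-level contour with fixed statistics."""
    lowest, highest = min(contour), max(contour)
    return {
        "mean": mean(contour),
        "std": pstdev(contour),  # population standard deviation over frames
        "min": lowest,
        "max": highest,
        "range": highest - lowest,
    }


def suprasegmental_vector(lld_frames):
    """Build one utterance-level feature dict.

    lld_frames maps a descriptor name (hypothetical, e.g. "energy")
    to the list of its per-frame values for a single utterance.
    """
    vector = {}
    for name in sorted(lld_frames):
        for stat, value in functionals(lld_frames[name]).items():
            vector[f"{name}_{stat}"] = value
    return vector


# Toy utterances of different lengths (values invented for illustration).
short_utt = {"energy": [0.1, 0.4, 0.3, 0.2],
             "pitch_hz": [110.0, 130.0, 120.0, 120.0]}
long_utt = {"energy": [0.2, 0.2, 0.5, 0.1, 0.3, 0.4],
            "pitch_hz": [100.0, 105.0, 140.0, 135.0, 120.0, 110.0]}

short_features = suprasegmental_vector(short_utt)
long_features = suprasegmental_vector(long_utt)
```

Both utterances yield the same set of feature names regardless of their frame counts, which is what lets a static classifier operate on them. A frame-level HMM, by contrast, would consume the contours directly.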
Pages: 552 / +
Page count: 2