Classifier Subset Selection for the Stacked Generalization Method Applied to Emotion Recognition in Speech

Cited by: 15
Authors
Alvarez, Aitor [1]
Sierra, Basilio [2]
Arruti, Andoni [2]
Lopez-Gil, Juan-Miguel [2]
Garay-Vitoria, Nestor [2]
Affiliations
[1] Vicomtech IK4, Human Speech & Language Technol Dept, Paseo Mikeletegi 57, Donostia San Sebastian 20009, Spain
[2] Univ Basque Country, UPV EHU, Paseo Manuel Lardizabal 1, Donostia San Sebastian 20018, Spain
Keywords
affective computing; machine learning; speech emotion recognition; Bayesian networks; features
DOI
10.3390/s16010021
Chinese Library Classification (CLC) number
O65 [Analytical Chemistry]
Subject classification codes
070302; 081704
Abstract
In this paper, a new supervised classification paradigm, called classifier subset selection for stacked generalization (CSS stacking), is presented for speech emotion recognition. The approach improves the bi-level multi-classifier system known as stacked generalization by integrating an estimation of distribution algorithm (EDA) in the first layer to select the optimal subset of the standard base classifiers. The performance of the proposed paradigm was demonstrated over different configurations and datasets. First, several CSS stacking classifiers were constructed on the RekEmozio dataset, using a specific set of standard base classifiers and a total of 123 spectral, quality and prosodic features computed with in-house feature extraction algorithms. These initial CSS stacking classifiers were compared with other multi-classifier systems and with the standard classifiers employed, all built on the same set of speech features. Then, new CSS stacking classifiers were built on RekEmozio using a different set of acoustic parameters (the extended version of the Geneva Minimalistic Acoustic Parameter Set, eGeMAPS) and a different set of standard classifiers, employing the best meta-classifier from the initial experiments. The performance of these two CSS stacking classifiers was evaluated and compared. Finally, the new paradigm was tested on the well-known Berlin Emotional Speech database, comparing the performance of single, standard stacking and CSS stacking systems using the same parametrization as in the second phase. All of the classifications were performed at the categorical level, including the six primary emotions plus the neutral one.
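The first-layer selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not say which EDA variant is used, so this example assumes the simplest one, a univariate marginal distribution algorithm (UMDA), searching over binary masks that mark which base classifiers feed the meta-classifier. The names `umda_select`, `toy_fitness` and `GAINS` are hypothetical, and the toy fitness only stands in for what would plausibly be the validated accuracy of the meta-classifier on the selected subset; none of the values come from the paper.

```python
import random

# Hypothetical per-classifier contributions to accuracy (toy values, not from the paper).
GAINS = [0.30, -0.10, 0.20, -0.20, -0.05, 0.25]


def toy_fitness(mask):
    """Stand-in for the validated accuracy of a meta-classifier trained on
    the predictions of the selected base classifiers (assumption)."""
    return sum(gain for gain, bit in zip(GAINS, mask) if bit)


def umda_select(n_classifiers, fitness, pop_size=30, n_best=10,
                generations=20, seed=0):
    """Search binary inclusion masks with a univariate EDA (UMDA)."""
    rng = random.Random(seed)
    # Marginal probability of including each base classifier.
    probs = [0.5] * n_classifiers
    best_fit, best_mask = float("-inf"), None
    for _ in range(generations):
        # Sample a population of masks from the current marginals.
        population = [
            tuple(int(rng.random() < p) for p in probs)
            for _ in range(pop_size)
        ]
        # Evaluate each mask once and rank from best to worst.
        scored = sorted(((fitness(m), m) for m in population), reverse=True)
        if scored[0][0] > best_fit:
            best_fit, best_mask = scored[0]
        # Re-estimate the marginals from the best individuals, clamped so
        # that no classifier is permanently fixed in or out.
        top = [m for _, m in scored[:n_best]]
        probs = [
            min(0.95, max(0.05, sum(m[i] for m in top) / n_best))
            for i in range(n_classifiers)
        ]
    return best_mask, best_fit


best_mask, best_fit = umda_select(len(GAINS), toy_fitness)
print("selected base classifiers:", [i for i, bit in enumerate(best_mask) if bit])
print("toy fitness:", round(best_fit, 2))
```

In a real setting, `toy_fitness` would be replaced by a function that trains the second-layer classifier on the outputs of the masked-in base classifiers and returns its validation accuracy; the search loop itself would stay the same.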
Pages: 26