COMBINING FEATURE SELECTION AND REPRESENTATION FOR SPEECH EMOTION RECOGNITION

Cited by: 0
Authors
Han, Wenjing [1 ]
Ruan, Huabin [2 ]
Yu, Xiaojie [1 ]
Zhu, Xuan [1 ]
Affiliations
[1] Samsung R&D Inst China Beijing SRC B, Language Comp Lab, Beijing, Peoples R China
[2] Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China
Source
2016 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA & EXPO WORKSHOPS (ICMEW) | 2016
Keywords
speech emotion recognition; multiple kernel learning; denoising autoencoder; feature selection; feature representation;
DOI
None
CLC Number
TP3 [Computing Technology, Computer Technology]
Discipline Code
0812
Abstract
In this paper, we propose a method that combines feature selection and feature representation to generate discriminative features for speech emotion recognition. In the feature selection stage, a Multiple Kernel Learning (MKL) based strategy is used to obtain the optimal feature subset. Specifically, features selected at least n times across 10-fold cross-validation are collected into a new feature subset named the n-subset, and the n-subset yielding the highest classification accuracy is taken as the optimal one. In the feature representation stage, the optimal feature subset is mapped to a hidden representation by a denoising autoencoder (DAE), whose parameters are learned by minimizing the squared error between the original and the reconstructed input. The hidden representation then serves as the final feature set in the MKL model for emotion recognition. Our experimental results show significant performance improvements over the original features in both intra-corpus and cross-corpus scenarios.
Pages: 5