Mixed Entropy Down-Sampling based Ensemble Learning for Speech Emotion Recognition

Cited: 0
Authors
Xuan, Zhengji [1 ]
Li, Dongdong [1 ]
Wang, Zhe [1 ]
Yang, Hai [1 ]
Affiliation
[1] East China Univ Sci & Technol, Dept Comp Sci & Engn, Shanghai, Peoples R China
Source
2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN | 2023
Keywords
speech emotion recognition; ensemble learning; deep learning; down-sampling; boosting;
DOI
10.1109/IJCNN54540.2023.10191917
CLC Number
TP18 [Theory of artificial intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Emotional intensity varies across the positions of an utterance, and the weak segments with ambiguous emotion introduce noise into the model. We propose a boosting ensemble learning method based on mixed entropy down-sampling that effectively selects emotionally salient segments to improve classifier performance. An independent Convolutional Neural Network (CNN) model is trained in each iteration of ensemble learning. These CNN models form an ensemble classifier that improves generalization by combining the results learned across all down-sampling rounds, making emotion recognition more accurate. We also introduce the concept of Mixed Information Entropy (MIE), composed of Emotional Certainty Entropy (ECE) and Structural Distribution Entropy (SDE): ECE measures the emotional confusion of a segment, while SDE measures the segment's stability in the deep feature space. During each iteration, deep features are taken from the model's last fully connected layer and down-sampled according to the weighted sum of confidence and MIE; the selected segments with stronger emotion are used in the next iteration. On the IEMOCAP dataset, our method outperforms a naive CNN model by 3.77% in weighted accuracy (WA) and 2.37% in unweighted accuracy (UA).
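The down-sampling step the abstract describes can be sketched roughly as follows. The paper does not give the exact ECE/SDE formulas or the weighting scheme here, so this is a minimal illustrative sketch under stated assumptions: ECE is taken as the Shannon entropy of each segment's class probabilities, SDE is approximated as the entropy of a segment's softmax-normalized distances to per-class feature centroids, and segments are ranked by confidence minus the weighted MIE. The function names and the `score = confidence - MIE` rule are assumptions, not the authors' implementation.

```python
import numpy as np

def emotional_certainty_entropy(probs):
    """Shannon entropy of each segment's class-probability vector.
    Low entropy -> the model is confident about the segment's emotion."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def structural_distribution_entropy(features, labels):
    """Hypothetical proxy for SDE: entropy of a segment's softmax-normalized
    similarities to the per-class centroids in deep-feature space.
    Segments far from every centroid (unstable) get high entropy."""
    centroids = np.stack([features[labels == c].mean(axis=0)
                          for c in np.unique(labels)])
    dist = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    sim = np.exp(-dist)
    sim /= sim.sum(axis=1, keepdims=True)
    return -(sim * np.log(np.clip(sim, 1e-12, 1.0))).sum(axis=1)

def mie_down_sample(probs, features, labels,
                    alpha=0.5, beta=0.5, keep_ratio=0.8):
    """Keep the keep_ratio fraction of segments with the strongest emotion,
    scored as confidence minus the weighted mixed entropy (alpha*ECE + beta*SDE)."""
    confidence = probs.max(axis=1)
    mie = (alpha * emotional_certainty_entropy(probs)
           + beta * structural_distribution_entropy(features, labels))
    score = confidence - mie              # high confidence, low entropy wins
    k = int(len(score) * keep_ratio)
    return np.argsort(-score)[:k]         # indices of segments kept for next round
```

In the iteration loop this would run after each CNN finishes training: the surviving indices select the segments fed to the next boosting round.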
Pages: 8