Mixed Entropy Down-Sampling based Ensemble Learning for Speech Emotion Recognition

Cited: 0
Authors
Xuan, Zhengji [1 ]
Li, Dongdong [1 ]
Wang, Zhe [1 ]
Yang, Hai [1 ]
Affiliation
[1] East China Univ Sci & Technol, Dept Comp Sci & Engn, Shanghai, Peoples R China
Source
2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN | 2023
Keywords
speech emotion recognition; ensemble learning; deep learning; down-sampling; boosting;
DOI
10.1109/IJCNN54540.2023.10191917
CLC Number
TP18 [Theory of artificial intelligence];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Emotional intensity varies across the positions of an utterance, and the weak segments with ambiguous emotion introduce noise into the model. We propose a boosting ensemble learning method based on mixed entropy down-sampling that effectively selects emotionally salient segments to improve classifier performance. An independent Convolutional Neural Network (CNN) model is trained in each iteration of ensemble learning. These CNN models form an ensemble classifier that improves generalization by combining the results learned across all down-sampling rounds, making emotion recognition more accurate. We also introduce the concept of Mixed Information Entropy (MIE), composed of Emotional Certainty Entropy (ECE) and Structural Distribution Entropy (SDE): ECE measures the emotional confusion of a segment, while SDE measures the segment's stability in the deep feature space. During each iteration, deep features are taken from the model's last fully connected layer and down-sampled according to the weighted sum of confidence and MIE; the selected segments with stronger emotion are used in the next iteration. On the IEMOCAP dataset, our method outperforms a naive CNN model by 3.77% in weighted accuracy (WA) and 2.37% in unweighted accuracy (UA).
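The down-sampling step the abstract describes can be sketched roughly as follows. The paper does not give the exact ECE/SDE formulas or the weighting scheme here, so this is a minimal illustrative sketch under stated assumptions: ECE is taken as the Shannon entropy of each segment's class probabilities, SDE is approximated as the entropy of a segment's softmax-normalized distances to per-class feature centroids, and segments are ranked by confidence minus the weighted MIE. The function names and the `score = confidence - MIE` rule are assumptions, not the authors' implementation.

```python
import numpy as np

def emotional_certainty_entropy(probs):
    """Shannon entropy of each segment's class-probability vector.
    Low entropy -> the model is confident about the segment's emotion."""
    p = np.clip(probs, 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)

def structural_distribution_entropy(features, labels):
    """Hypothetical proxy for SDE: entropy of a segment's softmax-normalized
    similarities to the per-class centroids in deep-feature space.
    Segments far from every centroid (unstable) get high entropy."""
    centroids = np.stack([features[labels == c].mean(axis=0)
                          for c in np.unique(labels)])
    dist = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    sim = np.exp(-dist)
    sim /= sim.sum(axis=1, keepdims=True)
    return -(sim * np.log(np.clip(sim, 1e-12, 1.0))).sum(axis=1)

def mie_down_sample(probs, features, labels,
                    alpha=0.5, beta=0.5, keep_ratio=0.8):
    """Keep the keep_ratio fraction of segments with the strongest emotion,
    scored as confidence minus the weighted mixed entropy (alpha*ECE + beta*SDE)."""
    confidence = probs.max(axis=1)
    mie = (alpha * emotional_certainty_entropy(probs)
           + beta * structural_distribution_entropy(features, labels))
    score = confidence - mie              # high confidence, low entropy wins
    k = int(len(score) * keep_ratio)
    return np.argsort(-score)[:k]         # indices of segments kept for next round
```

In the iteration loop this would run after each CNN finishes training: the surviving indices select the segments fed to the next boosting round.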
Pages: 8