Ensemble Learning with CNN-LSTM Combination for Speech Emotion Recognition

Authors
Tanberk, Senem [1]
Tukel, Dilek Bilgin [2]
Affiliations
[1] Orion Innovat Turkey, Istanbul, Turkey
[2] Dogus Univ, Istanbul, Turkey
Source
PROCEEDINGS OF INTERNATIONAL CONFERENCE ON COMPUTING AND COMMUNICATION NETWORKS (ICCCN 2021) | 2022 / Vol. 394
Keywords
Speech emotion recognition; Deep learning; Convolutional neural network; Long short-term memory; Ensemble learning;
DOI
10.1007/978-981-19-0604-6_5
Chinese Library Classification number
TP39 [Computer applications];
Subject classification codes
081203 ; 0835 ;
Abstract
Speech plays the most significant role in communication between people. The voice carries emotion and also conveys a speaker's unique characteristics, which can be mapped to biometric properties. Emotion includes many non-linguistic signals through which people express themselves. Recognizing emotion in human speech is a challenging task with applications in fields such as healthcare, services, telecommunications, video conferencing, and human-computer interaction (HCI). Deep learning techniques have become a significant focus of recent research in speech emotion recognition (SER). In this paper, we present an ensemble learning approach based on various combinations of CNN and LSTM networks to address the limitations of existing SER models. The proposed system is evaluated on the RAVDESS dataset. Specifically, the LSTM, CNN, and combined CNN-LSTM models achieved accuracies of 0.64, 0.73, and 0.71, respectively. The simulation results confirm that ensemble learning of the three deep model combinations contributes to the effectiveness of SER.
Pages: 39-47
Page count: 9