Ensemble Learning with CNN-LSTM Combination for Speech Emotion Recognition

Cited by: 0

Authors:

Tanberk, Senem [1]

Tukel, Dilek Bilgin [2]

Affiliations:

[1] Orion Innovat Turkey, Istanbul, Turkey

[2] Dogus Univ, Istanbul, Turkey

Source:

PROCEEDINGS OF INTERNATIONAL CONFERENCE ON COMPUTING AND COMMUNICATION NETWORKS (ICCCN 2021) | 2022, Vol. 394

Keywords:

Speech emotion recognition; Deep learning; Convolutional neural network; Long short-term memory; Ensemble learning;

DOI:

10.1007/978-981-19-0604-6_5

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Speech plays the most significant role in communication between people. The voice carries emotion and also allows a speaker's unique characteristics to be mapped to biometric properties. Emotion conveys many non-linguistic signals through which humans express themselves. Recognizing emotion in human speech is a challenging task with applications in fields such as healthcare, services, telecommunications, video conferencing, and human-computer interaction (HCI). Deep learning techniques have become a significant focus of recent research in speech emotion recognition (SER). In this paper, we present an ensemble learning approach based on various combinations of CNN and LSTM networks to address the limitations of existing SER models. The proposed system is evaluated on the RAVDESS dataset. Specifically, the LSTM, CNN, and combined CNN-LSTM models achieved accuracies of 0.64, 0.73, and 0.71, respectively. The simulation results confirm that ensemble learning over the three deep model combinations contributes to the effectiveness of SER.
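
The abstract does not say how the predictions of the three models are combined. As a rough illustration only, the sketch below shows one common ensembling scheme, soft voting, in which each model's class probabilities are averaged and the highest-scoring class is chosen. The function name `soft_vote`, the optional `weights` argument, and the toy three-class probability arrays are all assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

def soft_vote(prob_list, weights=None):
    """Combine per-model class probabilities by (weighted) averaging.

    prob_list: list of arrays, each of shape (n_samples, n_classes).
    weights:   optional per-model weights (hypothetical; equal if omitted).
    Returns (predicted_labels, averaged_probabilities).
    """
    probs = np.stack(prob_list, axis=0)        # (n_models, n_samples, n_classes)
    if weights is None:
        weights = np.ones(probs.shape[0])
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()          # normalise so weights sum to 1
    avg = np.tensordot(weights, probs, axes=1) # (n_samples, n_classes)
    return avg.argmax(axis=1), avg

# Toy probabilities for two utterances and three emotion classes
# (made-up numbers, standing in for LSTM, CNN, and CNN-LSTM outputs).
lstm_probs = np.array([[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]])
cnn_probs = np.array([[0.2, 0.7, 0.1], [0.1, 0.2, 0.7]])
cnn_lstm_probs = np.array([[0.3, 0.6, 0.1], [0.2, 0.3, 0.5]])

labels, avg = soft_vote([lstm_probs, cnn_probs, cnn_lstm_probs])
print(labels)
```

In this toy case the LSTM alone would pick class 0 for the first utterance and class 1 for the second, but the averaged probabilities favour classes 1 and 2, which shows how an ensemble can override a single weaker model. Other combination rules (majority voting, stacking) are equally plausible readings of "ensemble learning" here.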

Pages: 39-47

Page count: 9
