Speech emotion recognition and classification using hybrid deep CNN and BiLSTM model

Cited by: 0

Authors:

Swami Mishra

Nehal Bhatnagar

Prakasam P

Sureshkumar T. R.

Affiliation:

[1] Vellore Institute of Technology, School of Electronics Engineering

Source:

Multimedia Tools and Applications | 2024 / Vol. 83

Keywords:

Speech emotion recognition; Deep convolutional neural networks; LSTM; MFSC; Ensemble learning;

DOI:

Not available

Abstract:

Accurate emotion detection from speech utterances has recently been a challenging and active research area. Speech emotion recognition (SER) systems play an essential role in human-machine interaction, virtual reality, emergency services, and many other real-time systems. It remains an open problem because speakers from different regions and linguistic backgrounds convey emotions in markedly different ways. Conventional approaches classified emotions using low-level periodic features of the audio samples, such as energy and pitch, but these were neither accurate enough nor general enough. Recent advances in computer vision and neural networks make it possible to extract high-level features and achieve more accurate recognition. This study proposes an ensemble deep CNN + Bi-LSTM framework for speech emotion recognition and classification of seven different emotions. Paralinguistic log Mel-frequency spectral coefficients (MFSC) are used as the features to train the proposed architecture. The proposed hybrid model is validated on the TESS and SAVEE datasets. Experimental results indicate a classification accuracy of 96.36%. The proposed model is compared with existing models, and the comparison indicates that the hybrid deep CNN and Bi-LSTM model performs better.
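
To illustrate the MFSC feature named in the abstract, the following is a minimal sketch of how log Mel-frequency spectral coefficients might be computed for a single audio frame. The abstract gives no extraction parameters, so everything here is an assumption for illustration: the function names, the 26-filter bank, the Hamming window, and the common 2595 * log10(1 + f / 700) Mel formula. A naive DFT is used only to keep the example dependency-free; it is not the authors' pipeline.

```python
import math

def hz_to_mel(hz):
    # Common Mel-scale formula (assumed; the paper's variant is not stated).
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters as a list of n_filters rows, each with n_fft // 2 + 1 weights."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge points spaced evenly on the Mel scale, from 0 Hz to Nyquist.
    hz_points = [mel_to_hz(i * mel_max / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_freqs = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_filters + 1):
        low, centre, high = hz_points[m - 1], hz_points[m], hz_points[m + 1]
        row = []
        for f in bin_freqs:
            rising = (f - low) / (centre - low)
            falling = (high - f) / (high - centre)
            # Weight is the triangle height at f, clipped to zero outside the band.
            row.append(max(0.0, min(rising, falling)))
        bank.append(row)
    return bank

def power_spectrum(frame):
    """Hamming-windowed power spectrum of one frame, computed with a naive DFT."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2.0 * math.pi * k * i / n) for i, x in enumerate(windowed))
        im = -sum(x * math.sin(2.0 * math.pi * k * i / n) for i, x in enumerate(windowed))
        spectrum.append((re * re + im * im) / n)
    return spectrum

def log_mel_frame(frame, sample_rate, n_filters=26, eps=1e-10):
    """Log Mel filterbank energies (MFSC) for one frame; eps avoids log(0)."""
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    return [math.log(sum(w * p for w, p in zip(row, spectrum)) + eps) for row in bank]

# Usage: a 256-sample frame of a 1 kHz tone sampled at 8 kHz (synthetic test signal).
sample_rate = 8000
frame = [math.sin(2.0 * math.pi * 1000.0 * i / sample_rate) for i in range(256)]
mfsc = log_mel_frame(frame, sample_rate)
peak_filter = max(range(len(mfsc)), key=lambda i: mfsc[i])
```

Stacking such vectors over successive frames gives a time-by-frequency array that can be treated like an image, which is one plausible reason a CNN front end followed by a Bi-LSTM over the time axis suits this feature. Unlike MFCCs, no discrete cosine transform is applied, so neighbouring coefficients stay locally correlated along the frequency axis.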

Pages: 37603-37620

Page count: 17

