A DCRNN-based ensemble classifier for speech emotion recognition in Odia language

Cited by: 0

Authors:

Monorama Swain

Bubai Maji

P. Kabisatpathy

Aurobinda Routray

Affiliations:

[1] Silicon Institute of Technology, Department of Electronics and Communication Engineering

[2] CV Raman College of Engineering, Department of Electronics and Instrumentation

[3] Indian Institute of Technology, Department of Electrical Engineering

Source:

Complex & Intelligent Systems | 2022 / Vol. 8

Keywords:

Speech emotion recognition; Deep convolutional neural network; Bi-directional gated recurrent unit; Ensemble classifier

DOI:

Not available


Abstract:

Odia is an old Eastern Indo-Aryan language spoken by 46.8 million people across India. We design an ensemble classifier based on a deep convolutional recurrent neural network (DCRNN) for speech emotion recognition (SER). This study presents a new approach to SER tasks, motivated by recent research on speech emotion recognition. First, we extract utterance-level log Mel-spectrograms and their first and second derivatives (static, delta, and delta-delta), represented as 3-D log Mel-spectrograms. We use deep convolutional neural networks to extract deep features from the 3-D log Mel-spectrograms. A bi-directional gated recurrent unit network then models long-term temporal dependencies across these features to produce an utterance-level emotion prediction. Finally, we combine Softmax and Support Vector Machine classifiers in an ensemble to improve the final recognition rate. The proposed framework is trained and tested on the Odia (seven emotional states) and RAVDESS (eight emotional states) datasets. The experimental results show that the ensemble classifier performs better than a single classifier. The accuracies reached are 85.31% and 77.54% on the Odia and RAVDESS datasets, respectively, outperforming some state-of-the-art frameworks.
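The static, delta, and delta-delta stacking described in the abstract can be illustrated with a minimal sketch. The abstract does not state how the derivatives are computed, so the regression-based delta formula, the window size `n=2`, the edge handling, and the function names `delta` and `stack_3d` are all assumptions made for illustration, not the authors' implementation. The input is a toy matrix standing in for a real log Mel-spectrogram.

```python
def delta(frames, n=2):
    """Regression-based delta of a [time][mel] feature matrix.

    Assumed formula (not taken from the paper):
        d_t = sum_{k=1..n} k * (c_{t+k} - c_{t-k}) / (2 * sum_{k=1..n} k^2)
    Edges are handled by repeating the first and last frames.
    """
    num_frames = len(frames)
    denom = 2 * sum(k * k for k in range(1, n + 1))
    out = []
    for t in range(num_frames):
        row = []
        for b in range(len(frames[t])):
            acc = 0.0
            for k in range(1, n + 1):
                ahead = frames[min(t + k, num_frames - 1)][b]
                behind = frames[max(t - k, 0)][b]
                acc += k * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_3d(log_mel):
    """Return [3][time][mel]: static, delta, and delta-delta channels."""
    d1 = delta(log_mel)
    d2 = delta(d1)
    return [log_mel, d1, d2]


# Toy input (hypothetical data): 5 frames x 2 mel bands, rising linearly in time.
log_mel = [[float(t), 2.0 * t] for t in range(5)]
features = stack_3d(log_mel)

# Channels, frames, mel bands
print(len(features), len(features[0]), len(features[0][0]))  # prints: 3 5 2
```

The three channels play the same role as the color channels of an image, which is what lets a convolutional network consume the stacked spectrogram directly. In practice an array library would replace the nested lists.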

Pages: 4237-4249

Page count: 12
