Performance of deer hunting optimization based deep learning algorithm for speech emotion recognition

Cited by: 31

Authors:

Agarwal, Gaurav [1]

Om, Hari [1]

Affiliations:

[1] IIT ISM, Dept Comp Sci & Engn, Dhanbad 826004, Jharkhand, India

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2021 / Vol. 80 / Issue 07

Keywords:

Speech emotion recognition; Adaptive wavelet transform; Modified galactic swarm optimization; Adaptive sunflower optimization algorithm; Optimized deep neural network; Deer hunting optimization algorithm; IDENTIFICATION; SYSTEM; VOICE;

DOI:

10.1007/s11042-020-10118-x

Chinese Library Classification number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

This paper proposes a speech emotion recognition technique based on an optimized deep neural network. The speech signals are denoised with a novel adaptive wavelet transform combined with a modified galactic swarm optimization algorithm (AWT_MGSO). From the denoised speech signals, spectral features such as LPC (linear prediction coefficients), MFCC (Mel-frequency cepstral coefficients), and PSD (power spectral density), and prosodic features such as energy, entropy, formant frequencies, and pitch, are extracted, and a subset of these features is selected by ASFO (Adaptive Sunflower Optimization Algorithm). An optimized DNN-DHO (Deep Neural Network with Deer Hunting Optimization Algorithm) is proposed for emotion classification, and an enhanced squirrel search algorithm is proposed to update the weights in the optimized DNN-DHO classifier. In this study, all eight emotions in the speech from the RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song) and TESS (Toronto Emotional Speech Set) databases for English and the IITKGP-SEHSC (Indian Institute of Technology Kharagpur Simulated Emotion Hindi Speech Corpus) database for Hindi are classified. The experimental results are compared with those of classifiers such as DNN-DHO, DNN (Deep Neural Network), and DAE (Deep Auto Encoder). The results show that the proposed algorithm attains a maximum accuracy of 97.85% on the TESS dataset, 97.14% on the RAVDESS dataset, and 93.75% on the IITKGP-SEHSC dataset with the DNN-HHO classifier.
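
As a rough illustration of two of the feature types named in the abstract (energy and entropy, computed here from a simple power spectral density estimate), the following minimal sketch may help. It is not the authors' implementation: the function names, the 64-sample frame, the naive DFT, and the choice of Shannon entropy over the normalized power spectrum as the "entropy" feature are all assumptions made for illustration.

```python
import cmath
import math


def frame_energy(frame):
    """Short-time energy: sum of squared samples in one frame."""
    return sum(x * x for x in frame)


def power_spectrum(frame):
    """Periodogram-style power estimate via a naive O(n^2) DFT.

    Only the non-negative frequency bins (0 .. n//2) are returned.
    """
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def spectral_entropy(frame):
    """Shannon entropy (in bits) of the normalized power spectrum.

    Assumed definition: one common choice, not necessarily the paper's.
    """
    psd = power_spectrum(frame)
    total = sum(psd)
    if total == 0.0:
        return 0.0
    probs = [p / total for p in psd if p > 0.0]
    return -sum(p * math.log2(p) for p in probs)


# Hypothetical inputs: a pure tone (8 cycles in a 64-sample frame)
# and a unit impulse, whose spectrum is flat.
tone = [math.sin(2 * math.pi * 8 * t / 64) for t in range(64)]
impulse = [1.0] + [0.0] * 63

features = {
    "tone_energy": frame_energy(tone),
    "tone_entropy": spectral_entropy(tone),
    "impulse_entropy": spectral_entropy(impulse),
}
```

Under these assumptions the pure tone concentrates its power in a single frequency bin, so its spectral entropy is close to zero, while the impulse spreads power evenly over all bins and reaches the maximum entropy for that frame length. In a real pipeline a library FFT would replace the naive DFT.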

Pages: 9961-9992

Number of pages: 32
