Performance of deer hunting optimization based deep learning algorithm for speech emotion recognition

Cited by: 31

Authors:

Agarwal, Gaurav [1]

Om, Hari [1]

Affiliations:

[1] IIT ISM, Dept Comp Sci & Engn, Dhanbad 826004, Jharkhand, India

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2021 / Volume 80 / Issue 07

Keywords:

Speech emotion recognition; Adaptive wavelet transform; Modified galactic swarm optimization; Adaptive sunflower optimization algorithm; Optimized deep neural network; Deer hunting optimization algorithm; IDENTIFICATION; SYSTEM; VOICE;

DOI:

10.1007/s11042-020-10118-x

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812 ;

Abstract:

This paper proposes a speech emotion recognition technique based on an optimized deep neural network. The speech signals are denoised using a novel adaptive wavelet transform with a modified galactic swarm optimization algorithm (AWT_MGSO). From the denoised speech signals, spectral features such as LPC (linear prediction coefficients), MFCC (Mel-frequency cepstral coefficients), and PSD (power spectral density), together with prosodic features such as energy, entropy, formant frequencies, and pitch, are extracted, and a subset of these features is selected by ASFO (Adaptive Sunflower Optimization Algorithm). An optimized DNN-DHO (Deep Neural Network with Deer Hunting Optimization Algorithm) is proposed for emotion classification. An enhanced squirrel search algorithm is proposed to update the weights in the optimized DNN-DHO classifier. In this study, all eight emotions of the speech from the RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song) and TESS (Toronto Emotional Speech Set) databases for English and the IITKGP-SEHSC (Indian Institute of Technology Kharagpur Simulated Emotion Hindi Speech Corpus) database for Hindi are classified. The experimental results are compared with classifiers such as DNN-DHO, DNN (Deep Neural Network), and DAE (Deep Auto Encoder). The results show that the proposed algorithm achieves a maximum accuracy of 97.85% on the TESS dataset, 97.14% on the RAVDESS dataset, and 93.75% on the IITKGP-SEHSC dataset with the DNN-DHO classifier.

Pages: 9961-9992

Page count: 32

50 entries in total

[31] Bi-Feature Selection Deep Learning-Based Techniques for Speech Emotion Recognition
Akinpelu, Samson
Viriri, Serestina
ADVANCES IN VISUAL COMPUTING, ISVC 2024, PT I, 2025, 15046 : 345 - 356
[32] Multi-Modal Fusion Emotion Recognition Method of Speech Expression Based on Deep Learning
Liu, Dong
Wang, Zhiyong
Wang, Lifeng
Chen, Longxi
FRONTIERS IN NEUROROBOTICS, 2021, 15
[33] BERIS: An mBERT-based Emotion Recognition Algorithm from Indian Speech
Mehra, Pramod
Verma, Shashi Kant
ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2022, 21 (05)
[34] Neural network-based blended ensemble learning for speech emotion recognition
Yalamanchili, Bhanusree
Samayamantula, Srinivas Kumar
Anne, Koteswara Rao
MULTIDIMENSIONAL SYSTEMS AND SIGNAL PROCESSING, 2022, 33 (04) : 1323 - 1348
[35] Transfer Learning for Speech Emotion Recognition
Han Zhijie
Zhao, Huijuan
Wang, Ruchuan
2019 IEEE 5TH INTL CONFERENCE ON BIG DATA SECURITY ON CLOUD (BIGDATASECURITY) / IEEE INTL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING (HPSC) / IEEE INTL CONFERENCE ON INTELLIGENT DATA AND SECURITY (IDS), 2019, : 96 - 99
[36] Speech Emotion Recognition Based on Sparse Transfer Learning Method
Song, Peng
Zheng, Wenming
Liang, Ruiyu
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2015, E98D (07) : 1409 - 1412
[37] Graph Learning Based Speaker Independent Speech Emotion Recognition
Xu, Xinzhou
Huang, Chengwei
Wu, Chen
Wang, Qingyun
Zhao, Li
ADVANCES IN ELECTRICAL AND COMPUTER ENGINEERING, 2014, 14 (02) : 17 - 22
[38] Speech emotion recognition based on genetic algorithm-decision tree fusion of deep and acoustic features
Sun, Linhui
Li, Qiu
Fu, Sheng
Li, Pingan
ETRI JOURNAL, 2022, 44 (03) : 462 - 475
[39] Speech Emotion Recognition based on Multi-Task Learning
Zhao, Huijuan
Han Zhijie
Wang, Ruchuan
2019 IEEE 5TH INTL CONFERENCE ON BIG DATA SECURITY ON CLOUD (BIGDATASECURITY) / IEEE INTL CONFERENCE ON HIGH PERFORMANCE AND SMART COMPUTING (HPSC) / IEEE INTL CONFERENCE ON INTELLIGENT DATA AND SECURITY (IDS), 2019, : 186 - 188
[40] AESR: Speech Recognition With Speech Emotion Recogniting Learning
Han, RongQi
Liu, Xin
Zhang, Hui
MAN-MACHINE SPEECH COMMUNICATION, NCMMSC 2024, 2025, 2312 : 91 - 101
