Performance of deer hunting optimization based deep learning algorithm for speech emotion recognition

Cited by: 31
Authors
Agarwal, Gaurav [1]
Om, Hari [1]
Affiliations
[1] IIT ISM, Dept Comp Sci & Engn, Dhanbad 826004, Jharkhand, India
Keywords
Speech emotion recognition; Adaptive wavelet transform; Modified galactic swarm optimization; Adaptive sunflower optimization algorithm; Optimized deep neural network; Deer hunting optimization algorithm; IDENTIFICATION; SYSTEM; VOICE
DOI
10.1007/s11042-020-10118-x
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
This paper proposes a speech emotion recognition technique based on an optimized deep neural network. The speech signals are denoised using a novel adaptive wavelet transform with a modified galactic swarm optimization algorithm (AWT_MGSO). From the denoised speech signals, spectral features such as LPC (linear prediction coefficients), MFCC (Mel-frequency cepstral coefficients) and PSD (power spectral density), and prosodic features such as energy, entropy, formant frequencies and pitch, are extracted, and a subset of these features is selected by ASFO (Adaptive Sunflower Optimization Algorithm). An optimized DNN-DHO (Deep Neural Network with Deer Hunting Optimization Algorithm) is proposed for emotion classification, and an enhanced squirrel search algorithm is proposed to update the weights in the optimized DNN-DHO classifier. In this study, all eight emotions in the speech from the RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song) and TESS (Toronto Emotional Speech Set) databases for English and the IITKGP-SEHSC (Indian Institute of Technology Kharagpur Simulated Emotion Hindi Speech Corpus) database for Hindi are classified. The experimental results are compared with those of classifiers such as DNN-DHO, DNN (Deep Neural Network) and DAE (Deep Auto Encoder). The results show that the proposed optimized DNN-DHO classifier attains a maximum accuracy of 97.85% on the TESS dataset, 97.14% on the RAVDESS dataset and 93.75% on the IITKGP-SEHSC dataset.
Pages: 9961 - 9992
Number of pages: 32
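
The abstract lists energy, entropy and PSD among the features extracted from the denoised speech, but gives no implementation details. The following is a minimal sketch of how such frame-level features might be computed, not the authors' method. The frame length, hop size, the use of a periodogram as the PSD estimate, the choice of spectral (Shannon) entropy as the entropy feature, and all function names are assumptions made for illustration. A synthetic sine tone stands in for a denoised speech signal.

```python
import cmath
import math


def frame_signal(signal, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def power_spectrum(frame):
    """Periodogram-style PSD estimate using a naive O(n^2) DFT.

    Only the non-negative frequency bins (0 .. n/2) are returned.
    """
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def spectral_entropy(psd):
    """Shannon entropy in bits of the PSD normalised to sum to one."""
    total = sum(psd)
    if total == 0:
        return 0.0
    probs = [p / total for p in psd if p > 0]
    return -sum(p * math.log2(p) for p in probs)


def extract_features(signal, frame_len=64, hop=32):
    """Return one (energy, spectral entropy) pair per frame.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    feats = []
    for frame in frame_signal(signal, frame_len, hop):
        psd = power_spectrum(frame)
        feats.append((short_time_energy(frame), spectral_entropy(psd)))
    return feats


# Hypothetical input: a 440 Hz tone sampled at 8 kHz in place of real speech.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(256)]
features = extract_features(tone)
```

In a full pipeline these per-frame values would be pooled with the other listed features (LPC, MFCC, formants, pitch) before feature selection; a practical implementation would also replace the naive DFT with an FFT.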