Performance of deer hunting optimization based deep learning algorithm for speech emotion recognition

Cited by: 31
Authors
Agarwal, Gaurav [1 ]
Om, Hari [1 ]
Affiliations
[1] IIT ISM, Dept Comp Sci & Engn, Dhanbad 826004, Jharkhand, India
Keywords
Speech emotion recognition; Adaptive wavelet transform; Modified galactic swarm optimization; Adaptive sunflower optimization algorithm; Optimized deep neural network; Deer hunting optimization algorithm; IDENTIFICATION; SYSTEM; VOICE;
DOI
10.1007/s11042-020-10118-x
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This paper proposes a speech emotion recognition technique based on an optimized deep neural network. The speech signals are denoised using a novel adaptive wavelet transform with a modified galactic swarm optimization algorithm (AWT_MGSO). From the denoised speech signals, spectral features such as LPC (linear prediction coefficients), MFCC (Mel-frequency cepstral coefficients) and PSD (power spectral density), together with prosodic features such as energy, entropy, formant frequencies and pitch, are extracted, and a subset of these features is selected by ASFO (Adaptive Sunflower Optimization Algorithm). An optimized DNN-DHO (Deep Neural Network with Deer Hunting Optimization Algorithm) is proposed for emotion classification, and an enhanced squirrel search algorithm is proposed to update the weights in the optimized DNN-DHO classifier. In this study, all eight emotions in speech from the RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song) and TESS (Toronto Emotional Speech Set) databases for English and the IITKGP-SEHSC (Indian Institute of Technology Kharagpur Simulated Emotion Hindi Speech Corpus) database for Hindi are classified. The experimental results are compared with those of classifiers such as DNN-DHO, DNN (Deep Neural Network) and DAE (Deep Auto Encoder). The results show that the proposed algorithm achieves a maximum accuracy of 97.85% on the TESS dataset, 97.14% on the RAVDESS dataset and 93.75% on the IITKGP-SEHSC dataset with the DNN-DHO classifier.
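As a rough illustration of the feature extraction step named in the abstract, the sketch below computes two of the listed features, short-time energy and an entropy measure, per frame of a signal, using a simple periodogram as the power spectral density estimate. It is not the authors' implementation: the function names, frame length, hop size, the choice of spectral (Shannon) entropy, and the synthetic test tone are all assumptions made for this example, and denoising, LPC, MFCC, pitch, formants, feature selection and classification are not covered.

```python
import cmath
import math


def frame_signal(signal, frame_len, hop):
    """Split a sequence of samples into overlapping frames (no padding)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)


def periodogram(frame):
    """One-sided power spectrum estimate via a naive DFT (O(n^2), demo only)."""
    n = len(frame)
    psd = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        psd.append(abs(coeff) ** 2 / n)
    return psd


def spectral_entropy(psd):
    """Shannon entropy in bits of the power spectrum normalized to sum to 1."""
    total = sum(psd)
    if total == 0:
        return 0.0
    probs = [p / total for p in psd if p > 0]
    return -sum(p * math.log2(p) for p in probs)


def extract_features(signal, frame_len=64, hop=32):
    """Return one (energy, spectral entropy) pair per frame."""
    return [(short_time_energy(f), spectral_entropy(periodogram(f)))
            for f in frame_signal(signal, frame_len, hop)]


# Hypothetical stand-in for a denoised utterance: a pure tone that falls
# exactly on DFT bin 8 of a 64-sample frame.
tone = [math.sin(2 * math.pi * 8 * t / 64) for t in range(256)]
features = extract_features(tone)
```

A pure tone concentrates its power in a single frequency bin, so its spectral entropy is close to zero, while a broadband frame such as a single impulse spreads power evenly and gives the maximum entropy for that frame length. In a full system, per-frame values like these would typically be summarized per utterance before any feature selection.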
Pages: 9961-9992
Number of pages: 32