Performance of deer hunting optimization based deep learning algorithm for speech emotion recognition

Times cited: 31
Authors
Agarwal, Gaurav [1 ]
Om, Hari [1 ]
Affiliations
[1] IIT ISM, Dept Comp Sci & Engn, Dhanbad 826004, Jharkhand, India
Keywords
Speech emotion recognition; Adaptive wavelet transform; Modified galactic swarm optimization; Adaptive sunflower optimization algorithm; Optimized deep neural network; Deer hunting optimization algorithm; IDENTIFICATION; SYSTEM; VOICE
DOI
10.1007/s11042-020-10118-x
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
This paper proposes a speech emotion recognition technique based on an optimized deep neural network. The speech signals are denoised with a novel adaptive wavelet transform combined with a modified galactic swarm optimization algorithm (AWT_MGSO). From the denoised speech signals, spectral features such as LPC (linear prediction coefficients), MFCC (mel-frequency cepstral coefficients), and PSD (power spectral density), and prosodic features such as energy, entropy, formant frequencies, and pitch, are extracted, and a subset of these features is selected by ASFO (Adaptive Sunflower Optimization Algorithm). An optimized DNN-DHO (Deep Neural Network with Deer Hunting Optimization Algorithm) is proposed for emotion classification, and an enhanced squirrel search algorithm is proposed to update the weights in the optimized DNN-DHO classifier. In this study, all eight emotions of speech from the RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song) and TESS (Toronto Emotional Speech Set) databases for English, and from the IITKGP-SEHSC (Indian Institute of Technology Kharagpur Simulated Emotion Hindi Speech Corpus) database for Hindi, are classified. The experimental results are compared with classifiers such as DNN-DHO, DNN (Deep Neural Network), and DAE (Deep Auto Encoder). They show that the proposed algorithm attains a maximum accuracy of 97.85% on the TESS dataset, 97.14% on the RAVDESS dataset, and 93.75% on the IITKGP-SEHSC dataset with the DNN-DHO classifier.
Pages: 9961-9992
Number of pages: 32
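The abstract lists energy and entropy among the features extracted from the denoised speech. As a rough illustration only, the following sketch computes a per-frame short-time energy and a spectral entropy in plain Python. The function names, the frame length and hop size, the naive DFT, and the choice of Shannon entropy over the normalized power spectrum are all assumptions made for this example; the record does not state how the paper defines these features.

```python
import cmath
import math


def frame_signal(samples, frame_len, hop):
    """Split a sequence of samples into overlapping frames (drops the tail)."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Sum of squared sample values in one frame."""
    return sum(x * x for x in frame)


def power_spectrum(frame):
    """Naive O(n^2) DFT power spectrum over the non-negative frequency bins."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def spectral_entropy(frame):
    """Shannon entropy (in bits) of the normalized power spectrum.

    Assumed definition for illustration; returns 0.0 for a silent frame.
    """
    psd = power_spectrum(frame)
    total = sum(psd)
    if total == 0.0:
        return 0.0
    probs = [p / total for p in psd]
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def extract_features(samples, frame_len=64, hop=32):
    """Return one (energy, entropy) pair per frame (hypothetical helper)."""
    return [(short_time_energy(f), spectral_entropy(f))
            for f in frame_signal(samples, frame_len, hop)]
```

Under these assumptions, a pure tone concentrates its power in one frequency bin and gives an entropy near zero, while a single impulse spreads power evenly across bins and gives the maximum entropy for that frame length. A practical pipeline would use an FFT and a window function rather than the naive DFT shown here.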