Speech Emotion Recognition Based on Deep Residual Shrinkage Network

Cited by: 10
Authors
Han, Tian [1, 2]
Zhang, Zhu [1, 2]
Ren, Mingyuan [1]
Dong, Changchun [1]
Jiang, Xiaolin [1]
Zhuang, Quansheng [2]
Affiliations
[1] Jinhua Adv Res Inst, Dept Artificial Intelligence, Jinhua 321013, Peoples R China
[2] Harbin Univ Sci & Technol, Sch Measurement & Commun Engn, Harbin 150080, Peoples R China
Keywords
speech emotion recognition; mel-spectrogram; DRSN; BiGRU
DOI
10.3390/electronics12112512
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Speech emotion recognition (SER) technology is significant for human-computer interaction, and this paper studies the features and modeling of SER. The mel-spectrogram is introduced and used as the speech feature, and its theory and extraction process are presented in detail. A deep residual shrinkage network with a bi-directional gated recurrent unit (DRSN-BiGRU) is proposed, composed of a convolution network, a residual shrinkage network, a bi-directional gated recurrent unit, and a fully-connected network. Through the self-attention mechanism, DRSN-BiGRU can automatically ignore noisy information and improve its ability to learn effective features. Network optimization and verification experiments are carried out on three emotion datasets (CASIA, IEMOCAP, and MELD), on which DRSN-BiGRU reaches accuracies of 86.03%, 86.07%, and 70.57%, respectively. The results are also analyzed and compared with those of DCNN-LSTM, CNN-BiLSTM, and DRN-BiGRU, which verifies the superior performance of DRSN-BiGRU.
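The abstract does not spell out how the residual shrinkage network suppresses noisy information. Deep residual shrinkage networks are generally built around soft thresholding, with a per-channel threshold scaled by an attention-style coefficient. The following is a minimal sketch of that idea only, not the authors' implementation: the function names, the sample values, and the fixed `alpha` (which in a real network would be produced by a small learned sub-network) are all assumptions made for illustration.

```python
import math


def soft_threshold(x, tau):
    """Shrink x toward zero by tau; inputs with |x| <= tau map to zero."""
    return math.copysign(max(abs(x) - tau, 0.0), x)


def shrink_channel(values, alpha):
    """Soft-threshold one feature channel.

    The threshold is tau = alpha * mean(|values|), with alpha in (0, 1).
    Here alpha is a plain argument (an assumption); in a DRSN-style block
    it would come from a learned attention branch.
    """
    tau = alpha * sum(abs(v) for v in values) / len(values)
    return [soft_threshold(v, tau) for v in values]


# Hypothetical feature values: small entries stand in for noise.
channel = [0.05, -0.1, 2.0, -1.5, 0.02]
shrunk = shrink_channel(channel, alpha=0.5)
```

In this sketch the small-magnitude entries fall below the threshold and are zeroed, while the large entries are kept with reduced magnitude, which is the sense in which shrinkage can discard noise-like activations.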
Pages: 22