Research on speech emotion recognition algorithm for unbalanced data set

Cited by: 0
Authors
Liang Z. [1]
Li X. [1]
Song W. [1]
Affiliations
[1] Electronic Information Engineering, Changchun University of Science and Technology, Jilin Province
Source
Journal of Intelligent and Fuzzy Systems | 2020 / Vol. 39 / No. 03
Keywords
CRNN; focal loss; spectrograms; speech emotion recognition
DOI
10.3233/JIFS-191129
Abstract
In speech emotion recognition, most emotional corpora suffer from inconsistent sample lengths and imbalanced sample categories. To address these problems, this paper proposes a variable-length-input CRNN deep learning model based on focal loss for recognizing anger, happiness, neutrality, and sadness in the IEMOCAP emotional corpus. First, a variable-length strategy is introduced to feed the spectrograms of the padded speech samples into a CNN. Second, a masking matrix and the convolution layers preserve and output only the effective part of the input sequence. Third, this effective output is fed into a BiGRU network for learning. Finally, focal loss is used for network training to control and adjust the contribution of different samples to the total loss. Simulations show that, compared with traditional speech emotion recognition models, the method improves the accuracy and overall performance of emotion recognition. © 2020 - IOS Press and the authors. All rights reserved.
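The two ideas named in the abstract, masking the padded part of a variable-length input and down-weighting easy samples with focal loss, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the helper names `length_mask` and `focal_loss`, the focusing parameter `gamma=2.0`, and the optional per-class weights `alpha` are assumptions chosen for illustration, since the abstract does not state the values used.

```python
import numpy as np

def length_mask(lengths, max_len):
    """Boolean mask of shape (batch, max_len): True for real frames, False for padding."""
    positions = np.arange(max_len)[None, :]
    return positions < np.asarray(lengths)[:, None]

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def focal_loss(logits, labels, gamma=2.0, alpha=None):
    """Mean multi-class focal loss: -alpha_t * (1 - p_t)**gamma * log(p_t).

    gamma and alpha are illustrative assumptions, not values from the paper.
    With gamma=0 and alpha=None this reduces to ordinary cross-entropy.
    """
    labels = np.asarray(labels)
    probs = softmax(np.asarray(logits, dtype=float))
    # Probability assigned to the true class of each sample.
    p_t = np.clip(probs[np.arange(len(labels)), labels], 1e-12, 1.0)
    # Well-classified samples (p_t near 1) get a weight near 0.
    weights = (1.0 - p_t) ** gamma
    if alpha is not None:
        weights = weights * np.asarray(alpha, dtype=float)[labels]
    return float(np.mean(-weights * np.log(p_t)))

# Two utterances padded to 4 frames, with 2 and 3 real frames respectively.
mask = length_mask([2, 3], max_len=4)

# Four emotion classes; the first sample is easy, the second is hard.
logits = np.array([[4.0, 0.0, 0.0, 0.0],
                   [0.0, 0.5, 0.0, 0.0]])
labels = np.array([0, 1])
cross_entropy = focal_loss(logits, labels, gamma=0.0)
focal = focal_loss(logits, labels, gamma=2.0)
```

In this sketch the easy first sample contributes far less to `focal` than to `cross_entropy`, which is the mechanism the abstract describes for adjusting each sample's contribution to the total loss on an imbalanced corpus.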
Pages: 2791-2796
Page count: 5