Speech Emotion Recognition Using Multi-Layer Perceptron Classifier

Cited by: 1
Authors
Yuan, Xiaochen [1 ]
Wong, Wai Pang [1 ]
Lam, Chan Tong [1 ]
Affiliations
[1] Macao Polytech Univ, Fac Sci Appl, Macau, Peoples R China
Source
2022 IEEE 10TH INTERNATIONAL CONFERENCE ON INFORMATION, COMMUNICATION AND NETWORKS (ICICN 2022) | 2022
Keywords
speech emotion recognition; multi-layer perceptron classifier; mel-frequency cepstral coefficients; openSMILE feature
DOI
10.1109/ICICN56848.2022.10006474
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
This paper proposes a speech emotion recognition approach using the Multi-Layer Perceptron Classifier (MLP Classifier). Mel-Frequency Cepstral Coefficients (MFCC) features and openSMILE features are extracted separately. With the extracted features, the MLP Classifier is used to classify the speech emotion. The Berlin database, which contains seven emotions (happiness, anger, anxiety/fear, boredom, disgust, sadness, and neutral), is used to evaluate the performance of the proposed approach. Data augmentation is further employed, and experimental results show that the proposed approach achieves satisfactory performance. Comparisons with and without data augmentation indicate that better performance is obtained when data augmentation is applied.
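The pipeline summarized in the abstract (utterance-level MFCC features, data augmentation, and an MLP classifier) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' code: the data layout (data/<emotion>/*.wav), the MFCC dimensionality, the noise-injection augmentation, and the MLP layer sizes are all assumptions, and the openSMILE feature branch is omitted.

```python
# Minimal sketch (assumptions, not the authors' code): mean-pooled MFCC features,
# noise-injection augmentation (one common choice; the abstract does not specify
# the augmentation scheme), and a scikit-learn MLPClassifier.
# Hypothetical data layout: data/<emotion>/<utterance>.wav
import glob
import os

import librosa
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier


def mfcc_features(signal, sr, n_mfcc=40):
    """Mean MFCC vector over all frames of an utterance (n_mfcc is illustrative)."""
    return librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc).mean(axis=1)


def build_set(paths, labels, augment=False):
    """Extract features; optionally add a noise-corrupted copy of each utterance."""
    X, y = [], []
    for path, label in zip(paths, labels):
        signal, sr = librosa.load(path, sr=16000)
        X.append(mfcc_features(signal, sr))
        y.append(label)
        if augment:
            noisy = signal + 0.005 * np.random.randn(len(signal))
            X.append(mfcc_features(noisy, sr))
            y.append(label)
    return np.array(X), y


paths = glob.glob("data/*/*.wav")
emotions = [os.path.basename(os.path.dirname(p)) for p in paths]
train_paths, test_paths, train_emos, test_emos = train_test_split(
    paths, emotions, test_size=0.2, stratify=emotions, random_state=42)

# Augment the training partition only, so the test set stays unmodified.
X_train, y_train = build_set(train_paths, train_emos, augment=True)
X_test, y_test = build_set(test_paths, test_emos)

# Hidden-layer sizes are an assumption; the MLP configuration is not given in the abstract.
clf = MLPClassifier(hidden_layer_sizes=(256, 128), max_iter=500, random_state=42)
clf.fit(X_train, y_train)
print("Test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```

The openSMILE-based features described in the paper would be produced with the openSMILE toolkit and fed to the same classifier; only the MFCC branch is sketched here.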
Pages: 644-648
Number of pages: 5