Speech Emotion Recognition Using Multi-Layer Perceptron Classifier

Cited by: 1
Authors
Yuan, Xiaochen [1 ]
Wong, Wai Pang [1 ]
Lam, Chan Tong [1 ]
Affiliations
[1] Macao Polytech Univ, Fac Sci Appl, Macau, Peoples R China
Source
2022 IEEE 10TH INTERNATIONAL CONFERENCE ON INFORMATION, COMMUNICATION AND NETWORKS (ICICN 2022) | 2022
Keywords
speech emotion recognition; multi-layer perceptron classifier; mel-frequency cepstral coefficients; openSMILE feature
DOI
10.1109/ICICN56848.2022.10006474
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
This paper proposes a speech emotion recognition approach using the Multi-Layer Perceptron Classifier (MLP Classifier). Mel-Frequency Cepstral Coefficients (MFCC) features and openSMILE features are extracted separately. With the extracted features, the MLP Classifier is used to classify the speech emotion. The Berlin database, which contains seven emotions (happiness, anger, anxiety/fear, boredom, disgust, sadness, and neutral), is used to evaluate the performance of the proposed approach. Data augmentation is further employed, and experimental results show that the proposed approach achieves satisfactory performance. Comparisons with and without data augmentation indicate that better performance is obtained when data augmentation is applied.
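The pipeline summarized in the abstract (utterance-level MFCC features, data augmentation, and an MLP classifier) can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' code: the data layout (data/<emotion>/*.wav), the MFCC dimensionality, the noise-injection augmentation, and the MLP layer sizes are all assumptions, and the openSMILE feature branch is omitted.

```python
# Minimal sketch (assumptions, not the authors' code): mean-pooled MFCC features,
# noise-injection augmentation (one common choice; the abstract does not specify
# the augmentation scheme), and a scikit-learn MLPClassifier.
# Hypothetical data layout: data/<emotion>/<utterance>.wav
import glob
import os

import librosa
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier


def mfcc_features(signal, sr, n_mfcc=40):
    """Mean MFCC vector over all frames of an utterance (n_mfcc is illustrative)."""
    return librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=n_mfcc).mean(axis=1)


def build_set(paths, labels, augment=False):
    """Extract features; optionally add a noise-corrupted copy of each utterance."""
    X, y = [], []
    for path, label in zip(paths, labels):
        signal, sr = librosa.load(path, sr=16000)
        X.append(mfcc_features(signal, sr))
        y.append(label)
        if augment:
            noisy = signal + 0.005 * np.random.randn(len(signal))
            X.append(mfcc_features(noisy, sr))
            y.append(label)
    return np.array(X), y


paths = glob.glob("data/*/*.wav")
emotions = [os.path.basename(os.path.dirname(p)) for p in paths]
train_paths, test_paths, train_emos, test_emos = train_test_split(
    paths, emotions, test_size=0.2, stratify=emotions, random_state=42)

# Augment the training partition only, so the test set stays unmodified.
X_train, y_train = build_set(train_paths, train_emos, augment=True)
X_test, y_test = build_set(test_paths, test_emos)

# Hidden-layer sizes are an assumption; the MLP configuration is not given in the abstract.
clf = MLPClassifier(hidden_layer_sizes=(256, 128), max_iter=500, random_state=42)
clf.fit(X_train, y_train)
print("Test accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```

The openSMILE-based features described in the paper would be produced with the openSMILE toolkit and fed to the same classifier; only the MFCC branch is sketched here.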
Pages: 644-648
Number of pages: 5