Gender-Aware Speech Emotion Recognition in Multiple Languages

Cited by: 0

Authors:

Nicolini, Marco [1]

Ntalampiras, Stavros [1]

Affiliations:

[1] Univ Milan, Dept Comp Sci, Milan, Italy

Source:

PATTERN RECOGNITION APPLICATIONS AND METHODS, ICPRAM 2023 | 2024 / Vol. 14547

Keywords:

Audio pattern recognition; Machine learning; Transfer learning; Convolutional neural network; YAMNet; Multilingual speech emotion recognition; CORPUS;

DOI:

10.1007/978-3-031-54726-3_7

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This article presents a solution for Speech Emotion Recognition (SER) in a multilingual setting using a hierarchical approach. The approach has two levels: the first identifies the gender of the speaker, and the second predicts their emotional state. We evaluate three classifiers of increasing complexity: k-NN, transfer learning based on YAMNet, and Bidirectional Long Short-Term Memory neural networks. The models were trained, validated, and tested on a dataset that covers the big-six emotions and was assembled from well-known SER datasets representing six different languages. Our results indicate that classification accuracy differs when considering all data versus only female or male data, across all classifiers. Interestingly, prior knowledge of the speaker's gender can improve the overall classification performance.
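The two-level idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify features or hyperparameters, so the 2-D feature vectors, the toy samples, and the names `knn_predict`, `HierarchicalSER`, and `TOY_SAMPLES` are all hypothetical. The sketch uses a plain k-NN at both levels, with level 1 predicting gender and level 2 choosing an emotion model trained only on that gender's data.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (feature_vector, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


class HierarchicalSER:
    """Level 1 predicts gender; level 2 uses a gender-specific emotion model."""

    def __init__(self, k=3):
        self.k = k
        self.gender_train = []
        self.emotion_train = {}

    def fit(self, samples):
        # samples: list of (features, gender, emotion) triples.
        self.gender_train = [(x, g) for x, g, _ in samples]
        self.emotion_train = {}
        for x, g, e in samples:
            # One emotion training set per gender.
            self.emotion_train.setdefault(g, []).append((x, e))
        return self

    def predict(self, x):
        gender = knn_predict(self.gender_train, x, self.k)
        emotion = knn_predict(self.emotion_train[gender], x, self.k)
        return gender, emotion


# Hypothetical toy data: (feature_1, feature_2), gender, emotion.
# Real systems would use acoustic features extracted from audio.
TOY_SAMPLES = [
    ((1.0, 0.9), "female", "anger"),
    ((1.1, 1.0), "female", "anger"),
    ((0.9, 1.1), "female", "anger"),
    ((1.0, 0.1), "female", "sadness"),
    ((1.1, 0.0), "female", "sadness"),
    ((0.9, 0.2), "female", "sadness"),
    ((0.0, 0.9), "male", "anger"),
    ((0.1, 1.0), "male", "anger"),
    ((-0.1, 1.1), "male", "anger"),
    ((0.0, 0.1), "male", "sadness"),
    ((0.1, 0.0), "male", "sadness"),
    ((-0.1, 0.2), "male", "sadness"),
]

if __name__ == "__main__":
    model = HierarchicalSER(k=3).fit(TOY_SAMPLES)
    print(model.predict((0.95, 0.95)))
    print(model.predict((0.05, 0.05)))
```

In this arrangement an error at the gender level routes the sample to the wrong emotion model, which is one reason a study might compare gender-specific results against a single model trained on all data.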


Pages: 111-123

Page count: 13

