Gender-Aware Speech Emotion Recognition in Multiple Languages

Cited by: 0
Authors
Nicolini, Marco [1]
Ntalampiras, Stavros [1]
Affiliations
[1] Univ Milan, Dept Comp Sci, Milan, Italy
Source
PATTERN RECOGNITION APPLICATIONS AND METHODS, ICPRAM 2023 | 2024 / Vol. 14547
Keywords
Audio pattern recognition; Machine learning; Transfer learning; Convolutional neural network; YAMNet; Multilingual speech emotion recognition; CORPUS
DOI
10.1007/978-3-031-54726-3_7
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This article presents a solution for Speech Emotion Recognition (SER) in a multilingual setting using a hierarchical approach. The approach has two levels: the first identifies the gender of the speaker, and the second predicts the speaker's emotional state. We evaluate the performance of three classifiers of increasing complexity: k-NN, transfer learning based on YAMNet, and Bidirectional Long Short-Term Memory neural networks. The models were trained, validated, and tested on a dataset that includes the big-six emotions and was assembled from well-known SER datasets representing six different languages. Our results indicate that, for every classifier, classification accuracy differs depending on whether all data, only female data, or only male data are considered. Interestingly, prior knowledge of the speaker's gender can improve the overall classification performance.
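The two-level idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the two-dimensional features (a normalized pitch-like value and a normalized energy-like value), the toy samples, the two-emotion label set, and the names `knn_predict` and `HierarchicalSER` are all assumptions made for illustration. A first k-NN guesses the speaker's gender, and a second k-NN, restricted to training samples of that gender, guesses the emotion.

```python
from collections import Counter
import math


def knn_predict(train, query, k=3):
    """Majority label among the k training points closest to `query`.

    `train` is a list of (features, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


class HierarchicalSER:
    """Hypothetical two-level classifier: gender first, then emotion."""

    def __init__(self, k=3):
        self.k = k
        self.gender_train = []   # (features, gender) pairs for level 1
        self.emotion_train = {}  # gender -> (features, emotion) pairs for level 2

    def fit(self, samples):
        # Each sample is (features, gender, emotion).
        for feats, gender, emotion in samples:
            self.gender_train.append((feats, gender))
            self.emotion_train.setdefault(gender, []).append((feats, emotion))
        return self

    def predict(self, feats):
        # Level 1: predict gender from all training data.
        gender = knn_predict(self.gender_train, feats, self.k)
        # Level 2: predict emotion using only samples of the predicted gender.
        emotion = knn_predict(self.emotion_train[gender], feats, self.k)
        return gender, emotion


# Toy data (assumed, not from the paper): (pitch-like, energy-like) in [0, 1].
SAMPLES = [
    ((0.80, 0.90), "female", "happy"),
    ((0.85, 0.85), "female", "happy"),
    ((0.78, 0.95), "female", "happy"),
    ((0.82, 0.20), "female", "sad"),
    ((0.79, 0.15), "female", "sad"),
    ((0.86, 0.25), "female", "sad"),
    ((0.20, 0.90), "male", "happy"),
    ((0.25, 0.85), "male", "happy"),
    ((0.18, 0.95), "male", "happy"),
    ((0.22, 0.20), "male", "sad"),
    ((0.19, 0.15), "male", "sad"),
    ((0.26, 0.25), "male", "sad"),
]

if __name__ == "__main__":
    model = HierarchicalSER(k=3).fit(SAMPLES)
    print(model.predict((0.81, 0.88)))
    print(model.predict((0.21, 0.18)))
```

In this sketch an error at the first level propagates to the second, since the emotion model is chosen by the predicted gender. The abstract's remark about prior knowledge of gender corresponds to skipping level 1 and selecting the gender-specific model directly.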
Pages: 111-123
Page count: 13