Advanced differential evolution for gender-aware English speech emotion recognition

Cited by: 2
Authors
Yue, Liya [1 ]
Hu, Pei [2 ]
Zhu, Jiulong [1 ]
Affiliations
[1] Nanyang Inst Technol, Fanli Business Sch, Nanyang 473004, Peoples R China
[2] Nanyang Inst Technol, Sch Comp & Software, Nanyang 473004, Peoples R China
Keywords
Emotion recognition; Gender; Differential evolution
DOI
10.1038/s41598-024-68864-z
Chinese Library Classification (CLC)
O [Mathematical Sciences and Chemistry]; P [Astronomy and Earth Sciences]; Q [Biological Sciences]; N [General Natural Sciences]
Discipline Classification Codes
07; 0710; 09
Abstract
Speech emotion recognition (SER) technology involves feature extraction and prediction models. However, recognition efficiency tends to decrease because of gender differences and the large number of extracted features. Consequently, this paper introduces an SER system based on gender. First, gender and emotion features are extracted from speech signals to develop gender recognition and emotion classification models. Second, to account for gender differences, separate emotion recognition models are established for male and female speakers; a speaker's gender is determined first, and the corresponding emotion model is then applied. Third, the accuracy of these emotion models is enhanced by an advanced differential evolution algorithm (ADE) that selects optimal features. ADE incorporates new difference vectors, mutation operators, and position learning, which effectively balance global and local search, and a new position-repair method is proposed to address gender differences. Finally, experiments on four English datasets demonstrate that ADE is superior to comparison algorithms in recognition accuracy, recall, precision, F1-score, number of selected features, and execution time. The findings highlight the significance of gender in refining emotion models, while mel-frequency cepstral coefficients are important factors in gender differences.
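The feature-selection idea in the abstract can be illustrated with a minimal sketch of binary differential evolution: each candidate is a 0/1 mask over the extracted features, evolved with DE/rand/1 mutation, binomial crossover, and greedy selection. This is a generic illustration under stated assumptions, not the paper's ADE — its new difference vectors, position learning, and gender-aware position-repair operators are not reproduced here; the `fitness` function and all parameters below are hypothetical placeholders.

```python
import random

def binary_de_feature_selection(fitness, n_features, pop_size=10, n_gens=20,
                                f_scale=0.5, cr=0.9, seed=0):
    """Minimal binary DE for feature selection.

    Candidates are 0/1 masks over the features; `fitness(mask)` returns a
    score where higher is better (e.g. validation accuracy minus a
    sparsity penalty).
    """
    rng = random.Random(seed)
    # Initialise a random population of feature masks.
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]
    for _ in range(n_gens):
        for i in range(pop_size):
            # Pick three distinct donors different from the target vector i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(n_features)  # guarantees one mutated gene
            trial = []
            for j in range(n_features):
                # DE/rand/1 mutation in relaxed (continuous) space, then binarise.
                v = pop[a][j] + f_scale * (pop[b][j] - pop[c][j])
                bit = 1 if v >= 0.5 else 0
                # Binomial crossover with the target vector.
                trial.append(bit if (rng.random() < cr or j == j_rand) else pop[i][j])
            # Simple repair: keep at least one feature selected.
            if not any(trial):
                trial[rng.randrange(n_features)] = 1
            s = fitness(trial)
            if s >= scores[i]:  # greedy one-to-one survivor selection
                pop[i], scores[i] = trial, s
    best = max(range(pop_size), key=lambda i: scores[i])
    return pop[best], scores[best]
```

A toy fitness that rewards the first four of eight features and penalises the rest, e.g. `lambda m: sum(m[:4]) - 0.2 * sum(m[4:])`, lets the routine be run standalone; in an SER pipeline the fitness would instead train and score an emotion classifier on the masked feature set.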
Pages: 11