Bi-Feature Selection Deep Learning-Based Techniques for Speech Emotion Recognition

Cited by: 1

Authors:

Akinpelu, Samson [1]

Viriri, Serestina [1]

Affiliations:

[1] Univ KwaZulu Natal, Sch Math Stat & Comp Sci, Durban, South Africa

Source:

ADVANCES IN VISUAL COMPUTING, ISVC 2024, PT I | 2025 / Vol. 15046

Keywords:

Deep Learning; Speech Emotion Recognition; Deep Convolutional Neural Network; Feature Selection;

DOI:

10.1007/978-3-031-77392-1_26

Chinese Library Classification (CLC) number:

TP18 [Artificial Intelligence Theory]

Subject classification codes:

081104; 0812; 0835; 1405

Abstract:

Recognizing emotion from speech with digital computers remains a challenging task. Eliciting the salient features of speech signals that convey human emotion is vital to human-computer interaction. Many deep learning approaches to speech emotion recognition have reported significant results; however, combining deep feature selection approaches on features extracted from speech signals, to recognize emotion efficiently with low misclassification, still demands concrete attention. This paper proposes a deep learning technique that uses a lightweight, efficient deep convolutional neural network to extract features from the raw mel-spectrogram and Mel-Frequency Cepstral Coefficient (MFCC) features of speech signals. Two parallel feature selection techniques are then applied to the extracted features, and a fusion strategy is applied before final recognition using two best-performing classifiers. The proposed method was evaluated on the TESS dataset, where it yielded an accuracy of 99.5% and a specificity of 98% for recognition of seven classes of emotion. A comparison shows that the method outperforms the state-of-the-art approach, pushing speech emotion recognition (SER) performance beyond its current boundaries.
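
The abstract does not name the two feature selection techniques or the fusion rule, so the following is only an illustrative sketch of the general "two parallel selectors, then fusion" idea. Everything in it is an assumption made for illustration: the selectors (variance ranking and a Fisher-style class-separation score), the fusion rule (union of the selected indices), the function names, and the synthetic feature matrix that stands in for CNN-extracted features. The CNN feature extractor and the final classifiers are left out.

```python
import random
from statistics import mean, pvariance


def column(X, j):
    """Return column j of a row-major feature matrix."""
    return [row[j] for row in X]


def select_by_variance(X, k):
    """Hypothetical selector A: indices of the k highest-variance columns."""
    scores = [pvariance(column(X, j)) for j in range(len(X[0]))]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def select_by_class_separation(X, y, k):
    """Hypothetical selector B: indices of the k columns with the highest
    ratio of between-class to within-class variance (a Fisher-style score)."""
    labels = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        groups = [[row[j] for row, lab in zip(X, y) if lab == c] for c in labels]
        between = pvariance([mean(g) for g in groups])
        within = mean(pvariance(g) for g in groups)
        scores.append(between / (within + 1e-12))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def fuse(idx_a, idx_b):
    """Assumed fusion rule: union of the two selected index sets."""
    return sorted(set(idx_a) | set(idx_b))


def make_toy_features(n_per_class=30, seed=0):
    """Synthetic stand-in for extracted features (not TESS data)."""
    rng = random.Random(seed)
    X, y = [], []
    for c in range(3):
        for _ in range(n_per_class):
            X.append([
                5 * c + rng.gauss(0, 0.5),   # column 0: class-dependent
                -4 * c + rng.gauss(0, 0.5),  # column 1: class-dependent
                rng.gauss(0, 10),            # column 2: high variance, uninformative
                rng.gauss(0, 0.1),           # columns 3-5: low-variance noise
                rng.gauss(0, 0.1),
                rng.gauss(0, 0.1),
            ])
            y.append(c)
    return X, y


X, y = make_toy_features()
idx_a = select_by_variance(X, k=2)             # unsupervised branch
idx_b = select_by_class_separation(X, y, k=2)  # supervised branch
fused = fuse(idx_a, idx_b)
X_fused = [[row[j] for j in fused] for row in X]  # input for a downstream classifier
print("selector A:", idx_a, "selector B:", idx_b, "fused:", fused)
```

The two branches run independently on the same feature matrix, which is why they can disagree: the variance branch is drawn to the noisy column, while the class-separation branch ignores it. A union keeps whatever either branch considers useful; an intersection would be the stricter alternative.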


Pages: 345-356

Page count: 12
