Bi-Feature Selection Deep Learning-Based Techniques for Speech Emotion Recognition

Cited by: 1
Authors
Akinpelu, Samson [1]
Viriri, Serestina [1]
Affiliations
[1] Univ KwaZulu Natal, Sch Math Stat & Comp Sci, Durban, South Africa
Source
ADVANCES IN VISUAL COMPUTING, ISVC 2024, PT I | 2025 / Vol. 15046
Keywords
Deep Learning; Speech Emotion Recognition; Deep Convolutional Neural Network; Feature Selection
DOI
10.1007/978-3-031-77392-1_26
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Recognizing emotion from speech with digital computers remains a challenging task. Eliciting the salient features of speech signals that convey human emotion is vital to human-computer interaction. Many deep learning approaches to speech emotion recognition have been proposed, with significant results. However, combining deep feature selection approaches on features extracted from speech signals, to recognize emotion efficiently and with low misclassification, still demands concrete attention. This paper proposes a deep learning technique that uses a lightweight, efficient deep convolutional neural network to extract features from raw mel-spectrograms and Mel-Frequency Cepstral Coefficients (MFCCs) of speech signals. Two parallel feature selection techniques are then applied to the extracted features, and their outputs are combined with a fusion strategy before final recognition using the two best-performing classifiers. The proposed method was evaluated on the TESS dataset, where it achieved an accuracy of 99.5% and a specificity of 98% in recognizing seven classes of emotion. A comparison shows that the method outperforms the state-of-the-art approach, pushing speech emotion recognition (SER) performance beyond its current boundaries.
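The "two parallel feature selection techniques followed by fusion" step can be illustrated with a minimal sketch. The abstract does not name the two selectors or the fusion rule, so everything below is an assumption made for illustration: a variance ranking and a Fisher-style class-separation ranking stand in for the two selectors, the fusion is a simple union of the selected column indices, and the toy feature matrix stands in for CNN features extracted from mel-spectrograms and MFCCs. The classifier stage is omitted.

```python
from statistics import mean, pvariance

def top_k_by_variance(rows, k):
    """Selector A (assumed stand-in): keep the k columns with the highest variance."""
    n_cols = len(rows[0])
    scores = [pvariance([r[j] for r in rows]) for j in range(n_cols)]
    ranked = sorted(range(n_cols), key=lambda j: -scores[j])
    return sorted(ranked[:k])

def top_k_by_fisher_score(rows, labels, k):
    """Selector B (assumed stand-in): keep the k columns whose class means
    are best separated relative to the within-class spread."""
    n_cols = len(rows[0])
    classes = sorted(set(labels))
    scores = []
    for j in range(n_cols):
        col = [r[j] for r in rows]
        overall = mean(col)
        between, within = 0.0, 0.0
        for c in classes:
            vals = [v for v, y in zip(col, labels) if y == c]
            between += len(vals) * (mean(vals) - overall) ** 2
            within += len(vals) * pvariance(vals)
        scores.append(between / (within + 1e-12))  # small constant avoids division by zero
    ranked = sorted(range(n_cols), key=lambda j: -scores[j])
    return sorted(ranked[:k])

def fuse(idx_a, idx_b):
    """Fusion (assumed): union of the column indices chosen by each selector."""
    return sorted(set(idx_a) | set(idx_b))

def project(rows, idx):
    """Keep only the fused columns of every feature vector."""
    return [[r[j] for j in idx] for r in rows]

# Toy stand-in for deep features: 4 utterances x 4 feature columns.
rows = [
    [0.0, 5.0, 1.0, 0.2],
    [0.1, 5.0, 9.0, 0.8],
    [1.0, 5.0, 2.0, 0.3],
    [1.1, 5.0, 8.0, 0.7],
]
labels = ["happy", "happy", "sad", "sad"]

fused = fuse(top_k_by_variance(rows, 1), top_k_by_fisher_score(rows, labels, 1))
reduced = project(rows, fused)
print(fused, reduced)
```

In this toy data the two selectors pick different columns: one column varies widely but does not separate the classes, while another varies little but separates them cleanly. That difference is the usual motivation for running selectors in parallel and fusing their outputs rather than relying on a single criterion.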
Pages: 345-356
Page count: 12