Bi-Feature Selection Deep Learning-Based Techniques for Speech Emotion Recognition

Citations: 1
Authors
Akinpelu, Samson [1]
Viriri, Serestina [1]
Affiliations
[1] Univ KwaZulu Natal, Sch Math Stat & Comp Sci, Durban, South Africa
Source
ADVANCES IN VISUAL COMPUTING, ISVC 2024, PT I | 2025 / Vol. 15046
Keywords
Deep Learning; Speech Emotion Recognition; Deep Convolutional Neural Network; Feature Selection;
DOI
10.1007/978-3-031-77392-1_26
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Recognizing emotion from speech with digital computers remains a challenging task. Eliciting the salient features that contribute to human emotion from speech signals is vital to human-computer interaction. Many deep learning approaches have been proposed for recognizing emotion from speech, with significant results; however, combining deep feature selection approaches on features extracted from speech signals, for efficient emotion recognition with low misclassification, still demands concrete attention. This paper proposes a deep learning technique that uses a lightweight, efficient deep convolutional neural network to extract features from raw mel-spectrograms and Mel-Frequency Cepstral Coefficient (MFCC) features of speech signals. Two parallel feature selection techniques are then applied to the extracted features, and a fusion strategy is applied before final recognition using two best-performing classifiers. Evaluated on the TESS dataset, the proposed method achieved an accuracy of 99.5% and a specificity of 98% in recognizing seven classes of emotion. A comparison shows that the method outperforms the state-of-the-art approach, pushing speech emotion recognition (SER) performance beyond its current boundaries.
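The abstract reports accuracy and specificity for a seven-class problem without saying how specificity is aggregated across classes. The following is a minimal sketch of one common convention, assuming a confusion matrix with true labels on rows and predictions on columns, one-vs-rest specificity per class, and an unweighted (macro) average. The function names and the small three-class matrix are hypothetical and are not taken from the paper.

```python
from typing import List

Matrix = List[List[int]]  # cm[i][j] = samples of true class i predicted as class j


def accuracy(cm: Matrix) -> float:
    """Fraction of all samples that lie on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def per_class_specificity(cm: Matrix) -> List[float]:
    """One-vs-rest specificity TN / (TN + FP) for each class."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    result = []
    for k in range(n):
        tp = cm[k][k]
        row_sum = sum(cm[k])                       # all samples whose true class is k
        col_sum = sum(cm[i][k] for i in range(n))  # all samples predicted as k
        fp = col_sum - tp                          # predicted k, but true class differs
        tn = total - row_sum - col_sum + tp        # neither true k nor predicted k
        result.append(tn / (tn + fp))
    return result


def macro_specificity(cm: Matrix) -> float:
    """Unweighted mean of the per-class specificities (an assumed convention)."""
    values = per_class_specificity(cm)
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical three-class confusion matrix, for illustration only.
    example = [
        [5, 1, 0],
        [0, 4, 1],
        [1, 0, 6],
    ]
    print("accuracy:", accuracy(example))
    print("per-class specificity:", per_class_specificity(example))
    print("macro specificity:", macro_specificity(example))
```

For a seven-class task such as the one described, the same functions apply to a 7 x 7 matrix. Other aggregation choices (for example, weighting classes by support) would give different values.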
Pages: 345-356
Page count: 12