A hybrid deep feature selection framework for emotion recognition from human speeches

Cited by: 9
Authors
Marik, Aritra [1]
Chattopadhyay, Soumitri [1]
Singh, Pawan Kumar [1]
Affiliations
[1] Jadavpur Univ, Dept Informat Technol, Jadavpur Univ Second Campus,Plot 8,LB Block, Kolkata 700106, W Bengal, India
Funding
UK Research and Innovation;
Keywords
Speech emotion recognition; Deep learning; Feature selection; Fuzzy entropy & similarity measures; Whale optimization algorithm; ALGORITHMS;
DOI
10.1007/s11042-022-14052-y
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Speech Emotion Recognition (SER) is an active area of signal processing research that aims at identifying emotional states from audio speech signals. Applications of SER range from psychological diagnosis to human-computer interaction and, as such, a robust framework is needed for accurate classification. To this end, we propose a two-stage hybrid deep feature selection (HDFS) framework that combines deep learning with automated feature engineering for emotion recognition from human speech, and that performs well in terms of both accuracy and computational efficiency. Our pipeline extracts self-learned features from mel-spectrograms of raw audio signals using a customized Wide-ResNet-50-2 deep learning model. The dimensionality of these features is then reduced using a hybrid deep feature selection algorithm that comprises a fuzzy entropy and similarity-based feature ranking method, followed by the Whale optimization algorithm, a popular meta-heuristic optimization algorithm in the literature. A k-nearest neighbor classifier is used to classify the optimized feature subset into the respective emotion classes. The proposed pipeline is evaluated on three publicly available SER datasets using a 5-fold cross-validation scheme, where it is found to outperform several existing state-of-the-art works in the literature by significant margins, thus justifying the superiority and reliability of the proposed research. The source code of the proposed method can be found at: .
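The last stage described in the abstract, scoring a candidate feature subset with a k-nearest neighbor classifier under 5-fold cross-validation, can be illustrated with a minimal sketch. This is not the authors' implementation: it does not include the Wide-ResNet-50-2 extractor, the fuzzy entropy and similarity-based ranking, or the Whale optimization algorithm. The synthetic data, the binary masks, the choice of k = 5, and all function names (`knn_predict`, `apply_mask`, `cross_val_accuracy`, `make_toy_data`) are assumptions made purely for illustration.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=5):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def apply_mask(X, mask):
    """Keep only the feature columns whose mask bit is 1."""
    return [[v for v, keep in zip(row, mask) if keep] for row in X]


def cross_val_accuracy(X, y, mask, k=5, folds=5, seed=0):
    """Mean k-NN accuracy over `folds` shuffled folds, using only masked-in features."""
    Xs = apply_mask(X, mask)
    idx = list(range(len(Xs)))
    random.Random(seed).shuffle(idx)
    scores = []
    for f in range(folds):
        test_idx = idx[f::folds]          # every folds-th shuffled index forms one fold
        test_set = set(test_idx)
        train_idx = [i for i in idx if i not in test_set]
        tr_X = [Xs[i] for i in train_idx]
        tr_y = [y[i] for i in train_idx]
        correct = sum(knn_predict(tr_X, tr_y, Xs[i], k) == y[i] for i in test_idx)
        scores.append(correct / len(test_idx))
    return sum(scores) / folds


def make_toy_data(n_per_class=30, seed=1):
    """Synthetic stand-in for deep features: 2 informative columns + 2 noise columns."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in enumerate((0.0, 3.0)):
        for _ in range(n_per_class):
            X.append([
                rng.gauss(centre, 0.5),   # informative
                rng.gauss(centre, 0.5),   # informative
                rng.gauss(0.0, 5.0),      # noise
                rng.gauss(0.0, 5.0),      # noise
            ])
            y.append(label)
    return X, y


X, y = make_toy_data()
acc_all = cross_val_accuracy(X, y, mask=[1, 1, 1, 1])
acc_selected = cross_val_accuracy(X, y, mask=[1, 1, 0, 0])
print(f"all features: {acc_all:.3f}, selected subset: {acc_selected:.3f}")
```

In a wrapper-style selection loop, a score such as `cross_val_accuracy` would typically serve as the fitness of each candidate mask proposed by the optimizer. Whether the paper defines its fitness exactly this way is not stated in the abstract.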
Pages: 11461-11487
Page count: 27