Optimized feature engineering for machine learning-based emotion recognition from human speech

Cited by: 0

Authors:

Thakur, Anuja [1]

Kumar Dhull, Sanjeev [1]

Affiliations:

[1] Guru Jambheshwar Univ Sci & Technol, Dept EEE, Hisar 125001, Haryana, India

Source:

SIGNAL IMAGE AND VIDEO PROCESSING | 2025, Volume 19, Issue 08

Keywords:

Feature Engineering; Feature Selection; GA-T; Machine Learning; Speech Emotion Recognition; Spotted Hyena Optimization

DOI:

10.1007/s11760-025-04271-9

Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronics and Communication Technology]

Subject classification codes:

0808; 0809

Abstract:

This paper introduces a novel framework for Speech Emotion Recognition (SER) based on advanced feature selection (FS) with hybrid meta-heuristic algorithms, addressing persistent challenges in optimal feature choice and feature engineering. Despite progress in SER, selecting the most informative features remains complex and often limits model effectiveness. Our approach introduces three new hybrid models that integrate the Genetic Algorithm with Tournament selection (GA-T) and the Spotted Hyena Optimization (SHO) algorithm for feature optimization. The first model (GSHO-I) employs GA-T to refine the feature set generated by SHO, creating a robust filter that enhances feature relevance. In the second model (GSHO-II), GA-T and SHO are executed independently to assess feature importance, and their individual importance scores are averaged to create a consensus-based metric for feature selection. In the third model (GSHO-III), SHO optimizes the feature set generated by GA-T, creating a dynamic loop that maximizes feature diversity. The approach combines spectral, prosodic, and Wavelet Scattering (WS) features to construct a comprehensive feature set that enhances SER model precision. We rigorously evaluate the models on two extensive SER datasets, EmoDB and SAVEE, using Support Vector Machine (SVM), K-Nearest Neighbours (KNN), and neural network classifiers. The simulation analysis reveals that GSHO-II achieves significantly improved performance, particularly when all three feature types are combined for the SVM classifier. GSHO-II achieves accuracies of 96.23% on EmoDB and 92.86% on SAVEE with the SVM classifier, establishing the efficacy of the hybrid model in advancing SER accuracy.
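
The consensus step described for GSHO-II (averaging two independent importance scores per feature) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how scores are scaled or how many features are kept, so the min-max normalization, the top-`k` cutoff, and the names `min_max_normalize` and `consensus_select` are all assumptions made for illustration. The input score lists stand in for whatever GA-T and SHO would produce.

```python
from typing import List, Sequence, Tuple


def min_max_normalize(scores: Sequence[float]) -> List[float]:
    """Rescale scores to [0, 1]; a constant vector maps to all zeros.

    Assumption: normalization is added here so the two optimizers'
    scores are comparable before averaging. The abstract does not
    specify any scaling.
    """
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def consensus_select(
    ga_scores: Sequence[float],
    sho_scores: Sequence[float],
    k: int,
) -> Tuple[List[int], List[float]]:
    """Average two per-feature importance vectors and keep the top k.

    ga_scores / sho_scores are hypothetical importance scores, one per
    feature, as might come from two independently run optimizers.
    Returns the selected feature indices (ascending) and the consensus
    score of every feature.
    """
    if len(ga_scores) != len(sho_scores):
        raise ValueError("score vectors must have the same length")
    if not 0 < k <= len(ga_scores):
        raise ValueError("k must be between 1 and the number of features")

    ga = min_max_normalize(ga_scores)
    sho = min_max_normalize(sho_scores)

    # Consensus metric: plain arithmetic mean of the two normalized scores.
    consensus = [(a + b) / 2.0 for a, b in zip(ga, sho)]

    # Rank features by consensus score, highest first; ties keep index order.
    ranked = sorted(range(len(consensus)), key=lambda i: consensus[i], reverse=True)
    return sorted(ranked[:k]), consensus


if __name__ == "__main__":
    # Toy scores for four features (made-up numbers, not results from the paper).
    selected, consensus = consensus_select([4, 0, 2, 1], [1, 4, 3, 0], k=2)
    print(selected, consensus)
```

In this toy run a feature that only one optimizer ranks highly can still lose to a feature both rank moderately well, which is the intuition behind a consensus-based metric.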

Pages: 9
