This paper introduces a novel framework for Speech Emotion Recognition (SER) based on feature selection (FS) with hybrid meta-heuristic algorithms, addressing persistent challenges in optimal feature choice and feature engineering. Despite progress in SER, selecting the most informative features remains difficult and often limits model effectiveness. Our approach proposes three new hybrid models that integrate the Genetic Algorithm with Tournament selection (GA-T) and the Spotted Hyena Optimization (SHO) algorithm for feature optimization. In the first model (GSHO-I), GA-T refines the feature set generated by SHO, acting as a filter that enhances feature relevance. In the second model (GSHO-II), GA-T and SHO are executed independently to assess feature importance, and their individual importance scores are averaged to form a consensus-based metric for feature selection. In the third model (GSHO-III), SHO optimizes the feature set generated by GA-T, creating a dynamic loop that maximizes feature diversity. The feature set combines spectral, prosodic, and Wavelet Scattering (WS) features to improve SER model precision. We evaluate the models on two benchmark SER datasets, EmoDB and SAVEE, using Support Vector Machine (SVM), K-Nearest Neighbours (KNN), and neural network classifiers. The experimental results show that GSHO-II achieves significantly improved performance, particularly when all three feature types are combined and classified with the SVM. With the SVM classifier, GSHO-II achieves accuracies of 96.23% on EmoDB and 92.86% on SAVEE, demonstrating the efficacy of the hybrid model in advancing SER accuracy.
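
As an illustration of the consensus step in GSHO-II, the following minimal sketch averages two per-feature importance vectors and keeps the highest-scoring features. It is not the authors' implementation: the function names `consensus_scores` and `select_top_k`, the assumption that both selectors emit scores on a comparable scale, the top-k selection rule, and the toy score values are all hypothetical choices made for illustration only.

```python
import numpy as np


def consensus_scores(ga_scores, sho_scores):
    """Average two per-feature importance vectors (hypothetical helper).

    Assumes both vectors are on a comparable scale, e.g. in [0, 1].
    """
    ga = np.asarray(ga_scores, dtype=float)
    sho = np.asarray(sho_scores, dtype=float)
    if ga.shape != sho.shape:
        raise ValueError("score vectors must have the same shape")
    return (ga + sho) / 2.0


def select_top_k(scores, k):
    """Return the sorted indices of the k highest-scoring features.

    Top-k selection is an assumption; a score threshold would work equally well.
    """
    order = np.argsort(-scores, kind="stable")  # descending by score
    return np.sort(order[:k])


# Toy importance scores for four features (illustrative values, not paper data).
ga_scores = [0.9, 0.2, 0.6, 0.1]   # stand-in for GA-T importance scores
sho_scores = [0.7, 0.4, 0.8, 0.3]  # stand-in for SHO importance scores

avg = consensus_scores(ga_scores, sho_scores)
selected = select_top_k(avg, k=2)
print(selected)
```

In this sketch a feature is retained only when its averaged score ranks among the top k, so a feature favoured strongly by one selector but weakly by the other can still be dropped; this is one plausible reading of a consensus-based metric.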