Language-independent hyperparameter optimization based speech emotion recognition system

Cited by: 7
Authors
Thakur A. [1]
Dhull S.K. [1]
Affiliation
[1] Guru Jambheshwar University of Science and Technology, Hisar
Keywords
Feature extraction; Hyperparameter optimization; Machine learning; Speech emotion recognition
DOI
10.1007/s41870-022-00996-9
Abstract
Speech emotion recognition is challenging because the regions of different emotions overlap substantially. Extracting the features that influence emotion in speech and categorizing those emotions is a tedious task. In this manuscript, we aim to develop an effective and robust speech emotion recognition system capable of classifying ambiguous and overlapping emotions. Three feature sets (spectral, prosodic, and discrete wavelet transform) are extracted and further processed to reduce the required combination of features. Hyperparameter optimization is used to tune the parameters of the support vector machine classifier in the speech emotion recognition system. The suggested model is also verified on two datasets in different languages, ‘SAVEE’ and ‘EmoDB’, resulting in a language-independent system for recognizing emotion from speech. With the proposed technique, the performance achieved across seven emotion types is 90.02% on EmoDB (535 samples) and 71.66% on SAVEE (480 samples). © 2022, The Author(s), under exclusive licence to Bharati Vidyapeeth's Institute of Computer Applications and Management.
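The hyperparameter tuning step described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn and a plain grid search, neither of which the abstract confirms as the authors' tooling or search strategy. The feature matrix is synthetic (only its size, 535 samples and 7 classes, mirrors the EmoDB figures quoted above), and the search space for `C` and `gamma` is hypothetical, so the scores it prints say nothing about the reported results.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for an utterance-level feature matrix (NOT real speech data).
# 535 rows and 7 classes only mirror the EmoDB sizes quoted in the abstract.
X, y = make_classification(
    n_samples=535, n_features=40, n_informative=20,
    n_classes=7, n_clusters_per_class=1, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Scale features inside the pipeline so each CV fold is scaled on its own training part.
pipeline = Pipeline([("scale", StandardScaler()), ("svm", SVC())])

# Hypothetical search space; the paper's actual ranges are not given in the abstract.
param_grid = {
    "svm__kernel": ["rbf"],
    "svm__C": [0.1, 1, 10, 100],
    "svm__gamma": ["scale", 0.001, 0.01, 0.1],
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(pipeline, param_grid, cv=cv, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_              # best combination found on the training folds
test_accuracy = search.score(X_test, y_test)   # accuracy of the refit model on held-out data
print(best_params, round(test_accuracy, 3))
```

In practice, `X` would be replaced by the reduced spectral, prosodic, and wavelet feature vectors, and the same search would be run separately on each dataset.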
Pages: 3691-3699
Number of pages: 8