Hilbert Domain Analysis of Wavelet Packets for Emotional Speech Classification

Cited by: 0
Authors
Biswajit Karan
Arvind Kumar
Affiliations
[1] Aditya Engineering College (A), Department of Electronics and Communication Engineering
[2] University of Stellenbosch, Department of Electrical and Electronic Engineering
[3] Gitam University, Department of Electronics, Electrical and Communication Engineering
Source
Circuits, Systems, and Signal Processing | 2024 / Vol. 43
Keywords
Emotional; Wavelet packet; Hilbert transformation; SVM; NN; ECOC
DOI
Not available
Chinese Library Classification number
Subject classification number
Abstract
This work investigates the significance of Hilbert-domain characterization of wavelet packets for classifying different emotions in speech signals. The goal of the paper is to create a new emotional speech database and to introduce a new feature extraction approach that can recognize various emotions. The proposed features, wavelet cepstral coefficients (WCC), are based on Hilbert spectrum analysis of the wavelet packets of the speech signal. Speaker-independent machine learning models are developed using multiclass support vector machine (SVM) and k-nearest neighbour (KNN) classifiers. The approach is tested on a newly developed Telugu (Indian) database and on the EMOVO (Italian emotional speech) database. The proposed wavelet features achieve a peak accuracy of 73.5%, which NCA feature selection raises by a further 3–5%, giving an unweighted average recall (UAR) of 78% for database 1 and 87.50% for database 2 when the optimal wavelet features are used with SVM classification. The proposed features outperform the baseline Mel-frequency cepstral coefficient (MFCC) features, and they also perform better than other existing methods tested on databases in different languages.
Pages: 2224–2250
Page count: 26
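
The abstract reports results as unweighted average recall (UAR), which is the mean of the per-class recalls with every emotion class weighted equally, regardless of how many utterances it contains. The following is a minimal sketch of that metric only, not the authors' evaluation code; the function name and the example labels are hypothetical.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Return the mean of per-class recalls, weighting each class equally."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # correctly classified samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical, deliberately imbalanced labels (not data from the paper).
y_true = ["anger"] * 6 + ["joy"] * 2
y_pred = ["anger"] * 6 + ["anger", "joy"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

On imbalanced data such as this toy example, UAR is lower than plain accuracy because the minority class counts as much as the majority class, which is why the metric is common in emotion recognition.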