Automatic speech emotion recognition based on hybrid features with ANN, LDA and K_NN classifiers

Cited by: 1

Authors:

Al Dujaili, Mohammed Jawad [1]

Ebrahimi-Moghadam, Abbas [2]

Affiliations:

[1] Univ Kufa, Fac Engn, Dept Elect & Commun, Najaf, Iraq

[2] Ferdowsi Univ Mashhad, Elect Engn Dept, Fac Engn, Mashhad, Iran

Source:

MULTIMEDIA TOOLS AND APPLICATIONS | 2023 / Vol. 82 / Issue 27

Funding:

UK Research and Innovation (UKRI);

Keywords:

Speech emotion recognition (SER); MFCC; Jitter; Shimmer; PCA; ANN; LDA; K_NN; MODELS;

DOI:

10.1007/s11042-023-15413-x

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Subject classification code:

0812;

Abstract:

Despite considerable effort in speech emotion recognition (SER), a large gap remains between natural human feelings and computer perception. This article examines the recognition of speaker emotion in Persian and German. The Persian emotional speech data comprise 748 sentences covering seven emotions: Neutral, Disgust, Fear, Anger, Sadness, Boredom, and Happiness. The German emotional speech data consist of 536 sentences created by professional actors in a laboratory environment, 16 of which with seven different feelings of Happiness, Hatred, Naturalness, Fear, Sadness, Anger, and Fatigue. Widely used features are extracted: Mel Frequency Cepstral Coefficients (MFCC) and their derivatives, the local frequency perturbation coefficient (Jitter), and the local amplitude perturbation coefficient (Shimmer). Because of the vast number of options, the various features of this database are extracted separately. The feature space must be reduced before classification, so the principal component analysis (PCA) algorithm is applied. Three classifiers, Artificial Neural Network (ANN), Linear Discriminant Analysis (LDA), and K_Nearest Neighbor (K_NN), are then used to classify emotions. For the German database, the best results were obtained by fusing the MFCC + Shimmer features with LDA classification, giving a detection precision of 91.26% and an execution time of 0.43 s. For the Persian database, the best results were obtained by fusing the Jitter + Shimmer features with K_NN classification, giving a detection precision of 91.5% and an execution time of 0.65 s. The results show that the ability of the feature vectors to discriminate is quite different for each emotional state. The expression of emotions and their effect on speech differ between Persian and German.
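
The feature fusion and K_NN steps mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common "local" definitions of Jitter and Shimmer (mean absolute difference between consecutive glottal periods or peak amplitudes, divided by their mean) and a plain Euclidean majority-vote K_NN. The function names and the toy values are hypothetical and are not taken from the paper, and MFCC extraction and PCA are omitted for brevity.

```python
from collections import Counter
from math import dist


def local_perturbation(values):
    """Mean absolute difference of consecutive values, divided by their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def jitter_shimmer_features(periods, amplitudes):
    """Fuse local Jitter (from periods) and local Shimmer (from amplitudes)
    into a single feature vector by concatenation."""
    return [local_perturbation(periods), local_perturbation(amplitudes)]


def knn_predict(train_x, train_y, query, k=3):
    """Label the query by majority vote among its k nearest training vectors."""
    ranked = sorted(zip(train_x, train_y), key=lambda pair: dist(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy, made-up [jitter, shimmer] vectors; real values would come from
# pitch-period and peak-amplitude tracks of labelled utterances.
train_x = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [1.0, 1.0], [1.1, 1.0]]
train_y = ["neutral", "neutral", "neutral", "anger", "anger"]

query = jitter_shimmer_features([1.0, 3.0], [1.0, 3.0])
predicted = knn_predict(train_x, train_y, query, k=3)
```

Concatenation is the simplest form of fusion; in practice the fused vector would normally be scaled and reduced (for example with PCA) before it reaches the classifier.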

Pages: 42783-42801

Page count: 19
