Automatic speech emotion recognition based on hybrid features with ANN, LDA and K_NN classifiers

Cited by: 1
Authors
Al Dujaili, Mohammed Jawad [1 ]
Ebrahimi-Moghadam, Abbas [2 ]
Affiliations
[1] Univ Kufa, Fac Engn, Dept Elect & Commun, Najaf, Iraq
[2] Ferdowsi Univ Mashhad, Elect Engn Dept, Fac Engn, Mashhad, Iran
Funding
UK Research and Innovation (UKRI);
Keywords
Speech emotion recognition (SER); MFCC; Jitter; Shimmer; PCA; ANN; LDA; K_NN; MODELS;
DOI
10.1007/s11042-023-15413-x
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Despite many efforts in speech emotion recognition (SER), there is still a large gap between natural human feelings and computer perception. This article examines the recognition of a speaker's emotions in Persian and German. The Persian emotional speech database comprises 748 sentences expressing seven emotions: Neutral, Disgust, Fear, Anger, Sadness, Boredom, and Happiness. The German emotional speech database consists of 536 sentences created by professional actors in a laboratory environment, 16 of which with seven different feelings of Happiness, Hatred, Naturalness, Fear, Sadness, Anger, and Fatigue. After extracting widely used features such as Mel Frequency Cepstral Coefficients (MFCC) and their derivatives, the local frequency perturbation coefficient (Jitter), and the local perturbation coefficient (Shimmer), various features of each database are extracted separately because of the vast number of options. The feature space is then reduced with principal component analysis (PCA) before classification. Three classifiers are used to classify emotions: an Artificial Neural Network (ANN), Linear Discriminant Analysis (LDA), and K-Nearest Neighbor (K_NN). For the German database, the best results were obtained by fusing the MFCC + Shimmer features with the LDA classifier, giving a detection precision of 91.26% and an execution time of 0.43 s. For the Persian database, the best results were obtained by fusing the Jitter + Shimmer features with the K_NN classifier, giving a detection precision of 91.5% and an execution time of 0.65 s. The results show that the discriminative ability of the feature vectors differs considerably between emotional states, and that the expression of emotions and their effect on speech differ between Persian and German.
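The Jitter + Shimmer fusion with a K_NN classifier mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the common "local" definition of jitter and shimmer (mean absolute difference between consecutive cycle periods or peak amplitudes, divided by their mean), skips MFCC extraction and PCA entirely, and uses hypothetical function names and invented toy values that are not measurements from either database.

```python
from collections import Counter
from math import dist


def local_perturbation(values):
    """Mean absolute difference between consecutive values, relative to their mean.

    Applied to cycle periods this gives local jitter; applied to
    cycle peak amplitudes it gives local shimmer.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    mean_diff = sum(diffs) / len(diffs)
    mean_value = sum(values) / len(values)
    return mean_diff / mean_value


def fuse_features(periods, amplitudes):
    """Feature-level fusion: place jitter and shimmer in one feature vector."""
    return (local_perturbation(periods), local_perturbation(amplitudes))


def knn_predict(train, query, k=3):
    """Majority vote among the k training vectors closest to `query`."""
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Invented (jitter, shimmer) vectors, used only to exercise the code.
toy_train = [
    ((0.020, 0.050), "neutral"),
    ((0.025, 0.045), "neutral"),
    ((0.022, 0.055), "neutral"),
    ((0.080, 0.150), "anger"),
    ((0.090, 0.140), "anger"),
    ((0.085, 0.160), "anger"),
]

# Hypothetical cycle periods (seconds) and peak amplitudes for one utterance.
periods = [0.0100, 0.0102, 0.0099, 0.0101]
amplitudes = [0.80, 0.84, 0.79, 0.82]

predicted = knn_predict(toy_train, fuse_features(periods, amplitudes), k=3)
```

In a fuller pipeline the fused vector would typically be much longer (for example, with MFCC statistics appended) and would be standardized before distance computation, since Euclidean distance is sensitive to feature scale.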
Pages: 42783-42801
Page count: 19
Related papers
33 in total
  • [1] Automatic speech emotion recognition based on hybrid features with ANN, LDA and K_NN classifiers
    Mohammed Jawad Al Dujaili
    Abbas Ebrahimi-Moghadam
    Multimedia Tools and Applications, 2023, 82 : 42783 - 42801
  • [2] MFCC based Recognition of Repetitions and Prolongations in Stuttered Speech using k-NN and LDA
    Chee, Lim Sin
    Ai, Ooi Chia
    Hariharan, M.
    Yaacob, Sazali
    2009 IEEE STUDENT CONFERENCE ON RESEARCH AND DEVELOPMENT: SCORED 2009, PROCEEDINGS, 2009, : 146 - 149
  • [3] Speech emotion recognition based on a hybrid of HMM/ANN
    Mao, Xia
    Zhang, Bing
    Luo, Yi
    PROCEEDINGS OF THE 7TH WSEAS INTERNATIONAL CONFERENCE ON APPLIED INFORMATICS AND COMMUNICATIONS, 2007, : 369 - 372
  • [4] A New Object Recognition Framework based on PCA, LDA, and K-NN
    Hagar, Asmaa A. M.
    Alshewimy, Mahmoud A. M.
    Saidahmed, Mohamed T. Fahccm
    PROCEEDINGS OF 2016 11TH INTERNATIONAL CONFERENCE ON COMPUTER ENGINEERING & SYSTEMS (ICCES), 2016, : 141 - 146
  • [5] A Hybrid HMM/ANN Approach for Automatic Gujarati Speech Recognition
    Valaki, Sanjay
    Jethva, Harikrishna
    2017 INTERNATIONAL CONFERENCE ON INNOVATIONS IN INFORMATION, EMBEDDED AND COMMUNICATION SYSTEMS (ICIIECS), 2017,
  • [6] A Novel S-LDA Features for Automatic Emotion Recognition from Speech using 1-D CNN
    Tiwari, Pradeep
    Darji, A. D.
    INTERNATIONAL JOURNAL OF MATHEMATICAL ENGINEERING AND MANAGEMENT SCIENCES, 2022, 7 (01) : 49 - 67
  • [7] ANN based Decision Fusion for Speech Emotion Recognition
    Xu, Lu
    Xu, Mingxing
    Yang, Dali
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2003 - +
  • [8] Speech Emotion Recognition Based on Minimal Voice Quality Features
    Jacob, Agnes
    2016 INTERNATIONAL CONFERENCE ON COMMUNICATION AND SIGNAL PROCESSING (ICCSP), VOL. 1, 2016, : 886 - 890
  • [9] Deep Learning Algorithms for Speech Emotion Recognition with Hybrid Spectral Features
    Kogila R.
    Sadanandam M.
    Bhukya H.
    SN Computer Science, 5 (1)
  • [10] Database development and automatic speech recognition of isolated Pashto spoken digits using MFCC and K-NN
    Ali, Zakir
    Abbas, Arbab
    Thasleema, T.
    Uddin, Burhan
    Raaz, Tanzeela
    Abid, Sahibzada
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2015, 18 (02) : 271 - 275