Emotion recognition from speech signals using digital features optimization by diversity measure fusion

Citations: 0
Authors
Konduru, Ashok Kumar [1]
Iqbal, J. L. Mazher [2]
Affiliations
[1] Veltech Rangarajan Dr Sagunthala R&D Inst Sci & T, Chennai, Tamil Nadu, India
[2] Veltech Rangarajan Dr Sagunthala R&D Inst Sci & T, ECE, Chennai, Tamil Nadu, India
Keywords
Hidden Markov model; emotion detection; speech signal; artificial intelligence; cuckoo search; distributed diversity measures; FEATURE-SELECTION; ALGORITHM; NETWORKS
DOI
10.3233/JIFS-231263
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Emotion recognition from speech signals plays a crucial role in human-computer interaction and behavioral studies. The task is challenging, however, because speech data are high-dimensional and noisy. This article presents a study and analysis of a novel approach, "Digital Features Optimization by Diversity Measure Fusion (DFOFDM)", aimed at addressing these challenges. The paper first explains the need for improved emotion recognition methods and then introduces DFOFDM in detail. The approach uses acoustic and spectral features extracted from speech signals, together with a feature selection process optimized by a fusion of diversity measures. Its central method is a Cuckoo Search-based classification strategy tailored to this multi-label problem. The proposed approach is evaluated extensively. Emotion labels such as 'Angry', 'Happy', and 'Neutral' achieved precision above 92%, while the other emotions fell between 87% and 90%. Recall was similar, with most emotions falling between 90% and 95%, and the F-score showed comparable values for each label. Notably, DFOFDM was resilient to label imbalance and to noise in the speech data, both of which matter for real-world applications. Compared with a contemporary model, "Transfer Subspace Learning by Least Square Loss (TSLSL)", DFOFDM gave superior results across various evaluation metrics, indicating a promising improvement in speech emotion recognition. In terms of computational complexity, DFOFDM scaled effectively, making it a feasible solution for large-scale applications. The study nevertheless acknowledges potential limitations of DFOFDM that might affect its performance on certain types of real-world data. The findings underline the potential of DFOFDM for advancing emotion recognition techniques and indicate the need for further research.
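The abstract reports precision, recall, and F-score per emotion label. As a minimal sketch of how such per-label figures are commonly computed, the Python below uses the standard definitions. It is not the authors' evaluation code: the function name `per_label_scores`, the simplification to one label per utterance, and the toy labels are all assumptions made for illustration, and none of the values come from the paper.

```python
from collections import Counter


def per_label_scores(y_true, y_pred):
    """Return {label: (precision, recall, f_score)}.

    Hypothetical helper: assumes exactly one label per utterance.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this label
        else:
            fp[pred] += 1    # predicted label was wrong
            fn[truth] += 1   # true label was missed

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        # F-score here is the harmonic mean of precision and recall (F1).
        f_score = 2 * precision * recall / pr_sum if pr_sum else 0.0
        scores[label] = (precision, recall, f_score)
    return scores


# Toy labels invented for illustration only (not data from the paper).
y_true = ["Angry", "Angry", "Happy", "Happy", "Neutral", "Neutral"]
y_pred = ["Angry", "Happy", "Happy", "Happy", "Neutral", "Angry"]
scores = per_label_scores(y_true, y_pred)
for label, (p, r, f) in scores.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

Reporting the three metrics per label, rather than a single accuracy figure, makes it visible when a model does well on frequent emotions but poorly on rare ones, which is relevant to the label-imbalance claim in the abstract.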
Pages: 2547-2572
Page count: 26
Related papers
50 in total
  • [31] Diversity subspace generation based on feature selection for speech emotion recognition
    Ye, Qing
    Sun, Yaxin
    MULTIMEDIA TOOLS AND APPLICATIONS, 2023, 83 (8) : 23533 - 23561
  • [32] Fusion of PCA and ICA in Statistical Subset Analysis for Speech Emotion Recognition
    Kingeski, Rafael
    Henning, Elisa
    Paterno, Aleksander S.
    SENSORS, 2024, 24 (17)
  • [33] NOT ALL FEATURES ARE EQUAL: SELECTION OF ROBUST FEATURES FOR SPEECH EMOTION RECOGNITION IN NOISY ENVIRONMENTS
    Leem, Seong-Gyun
    Fulford, Daniel
    Onnela, Jukka-Pekka
    Gard, David
    Busso, Carlos
    2022 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2022, : 6447 - 6451
  • [34] Utilizing Enhanced Particle Swarm Optimization for Feature Selection in Gender-Emotion Detection From English Speech Signals
    Amjad, Ammar
    Tai, Li-Chia
    Chang, Hsien-Tsung
    IEEE ACCESS, 2024, 12 : 189564 - 189573
  • [35] Emotion Recognition using Facial and Audio features
    Krishna, Tarun
    Rai, Ayush
    Bansal, Shubham
    Khandelwal, Shubham
    Gupta, Shubham
    Goyal, Dushyant
    ICMI'13: PROCEEDINGS OF THE 2013 ACM INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION, 2013, : 557 - 562
  • [36] A Perspective Study on Speech Emotion Recognition: Databases, Features and Classification Models
    Raghu, Kogila
    Sadanandam, Manchala
    TRAITEMENT DU SIGNAL, 2021, 38 (06) : 1861 - 1873
  • [37] Speech emotion recognition using hidden Markov models
    Nwe, TL
    Foo, SW
    De Silva, LC
    SPEECH COMMUNICATION, 2003, 41 (04) : 603 - 623
  • [38] Emotion Recognition from Speech using Extended Feature Selection and a Simple Classifier
    Hassan, Ali
    Damper, Robert I.
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2011 - 2014
  • [39] SPEECH EMOTION RECOGNITION USING CYCLOSTATIONARY SPECTRAL ANALYSIS
    Jalili, Amin
    Sahami, Sadid
    Chi, Chong-Yung
    Amirfattahi, Rassoul
    2018 IEEE 28TH INTERNATIONAL WORKSHOP ON MACHINE LEARNING FOR SIGNAL PROCESSING (MLSP), 2018,
  • [40] EMOTION RECOGNITION USING SYNTHETIC SPEECH AS NEUTRAL REFERENCE
    Lotfian, Reza
    Busso, Carlos
    2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP), 2015, : 4759 - 4763