Emotion recognition from speech signals using digital features optimization by diversity measure fusion

Cited by: 0
Authors
Konduru, Ashok Kumar [1]
Iqbal, J. L. Mazher [2]
Affiliations
[1] Veltech Rangarajan Dr Sagunthala R&D Inst Sci & T, Chennai, Tamil Nadu, India
[2] Veltech Rangarajan Dr Sagunthala R&D Inst Sci & T, ECE, Chennai, Tamil Nadu, India
Keywords
Hidden Markov model; emotion detection; speech signal; artificial intelligence; cuckoo search; distributed diversity measures; FEATURE-SELECTION; ALGORITHM; NETWORKS
DOI
10.3233/JIFS-231263
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Emotion recognition from speech signals plays a crucial role in human-computer interaction and behavioral studies. The task, however, presents significant challenges due to the high dimensionality and noisy nature of speech data. This article presents a comprehensive study and analysis of a novel approach, "Digital Features Optimization by Diversity Measure Fusion (DFOFDM)", aimed at addressing these challenges. The paper begins by explaining the need for improved emotion recognition methods, followed by a detailed introduction to DFOFDM. The approach uses acoustic and spectral features from speech signals, coupled with an optimized feature selection process based on a fusion of diversity measures. The study's central method is a Cuckoo Search-based classification strategy tailored to this multi-label problem. The performance of the proposed DFOFDM approach is evaluated extensively. Emotion labels such as 'Angry', 'Happy', and 'Neutral' achieved precision above 92%, while other emotions fell within the range of 87% to 90%. Recall was similar, with most emotions falling within the 90% to 95% range. The F-score was comparable for each label. Notably, the DFOFDM model showed resilience to label imbalance and noise in speech data, which is crucial for real-world applications. Compared with a contemporary model, "Transfer Subspace Learning by Least Square Loss (TSLSL)", DFOFDM produced superior results across various evaluation metrics, indicating a promising improvement in speech emotion recognition. In terms of computational complexity, DFOFDM scaled effectively, making it a feasible solution for large-scale applications. Despite its effectiveness, the study acknowledges potential limitations of DFOFDM that might affect its performance on certain types of real-world data. The findings underline the potential of DFOFDM in advancing emotion recognition techniques and indicate the need for further research.
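The abstract reports precision, recall, and F-score per emotion label. As a reading aid, here is a minimal sketch of how such per-label scores are commonly computed. It is not the authors' evaluation code: the function name `per_label_scores`, the toy labels, and the simplification to one predicted label per utterance are all assumptions made for illustration.

```python
from collections import Counter


def per_label_scores(y_true, y_pred):
    """Return {label: (precision, recall, f_score)}.

    Assumes exactly one true and one predicted label per utterance
    (a simplification; the paper describes a multi-label setting).
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1      # correct prediction for this label
        else:
            fp[guess] += 1      # predicted label was wrong
            fn[truth] += 1      # true label was missed

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        pr_sum = precision + recall
        f_score = 2 * precision * recall / pr_sum if pr_sum else 0.0
        scores[label] = (precision, recall, f_score)
    return scores


# Hypothetical toy data, not results from the paper.
y_true = ["Angry", "Angry", "Happy", "Happy", "Neutral", "Neutral"]
y_pred = ["Angry", "Happy", "Happy", "Happy", "Neutral", "Angry"]

for label, (p, r, f) in per_label_scores(y_true, y_pred).items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f_score={f:.2f}")
```

Reporting scores per label rather than a single overall accuracy is what makes claims about label imbalance checkable, since a rare emotion with poor recall would otherwise be hidden by the frequent ones.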
Pages: 2547-2572
Number of pages: 26
Related papers
50 in total
  • [21] Emotion classification from speech signal based on empirical mode decomposition and non-linear features: Speech emotion recognition
    Krishnan, Palani Thanaraj
    Alex Noel, Joseph Raj
    Rajangam, Vijayarajan
    COMPLEX & INTELLIGENT SYSTEMS, 2021, 7 (04) : 1919 - 1934
  • [22] A Feature Fusion Model with Data Augmentation for Speech Emotion Recognition
    Tu, Zhongwen
    Liu, Bin
    Zhao, Wei
    Yan, Raoxin
    Zou, Yang
    APPLIED SCIENCES-BASEL, 2023, 13 (07):
  • [23] A Mobile Emotion Recognition System Based on Speech Signals and Facial Images
    Wu, Yu-Hao
    Lin, Shu-Jing
    Yang, Don-Lin
    2013 INTERNATIONAL COMPUTER SCIENCE AND ENGINEERING CONFERENCE (ICSEC), 2013, : 212 - 217
  • [24] Transformer-Based Multilingual Speech Emotion Recognition Using Data Augmentation and Feature Fusion
    Al-onazi, Badriyya B.
    Nauman, Muhammad Asif
    Jahangir, Rashid
    Malik, Muhmmad Mohsin
    Alkhammash, Eman H.
    Elshewey, Ahmed M.
    APPLIED SCIENCES-BASEL, 2022, 12 (18):
  • [25] Multiple Enhancements to LSTM for Learning Emotion-Salient Features in Speech Emotion Recognition
    Hu, Desheng
    Hu, Xinhui
    Xu, Xinkang
    INTERSPEECH 2022, 2022, : 4720 - 4724
  • [26] Biologically inspired emotion recognition from speech
    Laura Caponetti
    Cosimo Alessandro Buscicchio
    Giovanna Castellano
    EURASIP Journal on Advances in Signal Processing, 2011
  • [27] Biologically inspired emotion recognition from speech
    Caponetti, Laura
    Buscicchio, Cosimo Alessandro
    Castellano, Giovanna
    EURASIP JOURNAL ON ADVANCES IN SIGNAL PROCESSING, 2011,
  • [28] Automatic speech emotion recognition using an optimal combination of features based on EMD-TKEO
    Kerkeni, Leila
    Serrestou, Youssef
    Raoof, Kosai
    Mbarki, Mohamed
    Mahjoub, Mohamed Ali
    Cleder, Catherine
    SPEECH COMMUNICATION, 2019, 114 : 22 - 35
  • [29] Diversity subspace generation based on feature selection for speech emotion recognition
    Qing Ye
    Yaxin Sun
    Multimedia Tools and Applications, 2024, 83 : 23533 - 23561
  • [30] Acoustic feature analysis and optimization for Bangla speech emotion recognition
    Sultana, Sadia
    Rahman, Mohammad Shahidur
    ACOUSTICAL SCIENCE AND TECHNOLOGY, 2023, 44 (03) : 157 - 166