A review on emotion recognition from dialect speech using feature optimization and classification techniques

Cited by: 1
Authors
Thimmaiah, Sunil [1]
Vinay, N. A. [3]
Ravikumar, M. G. [1]
Prasad, S. R. [2]
Affiliations
[1] Nagarjuna Coll Engn & Technol, Bengaluru, India
[2] KNS Inst Technol, Bengaluru, India
[3] Dayananda Sagar Coll Engn, Bengaluru, India
Keywords
Emotion recognition; Dialect speech; Feature optimization; Classification techniques; Acoustic cues; Spectral features; Prosodic features; Temporal features; Support vector machines; Gaussian mixture models; Hidden Markov models; Machine learning; Convolutional neural networks; Long short-term memory networks; Feature selection; Dimensionality reduction; Principal component analysis; Recursive feature elimination; Datasets; Model generalization
DOI
10.1007/s11042-024-18297-7
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Emotion recognition from speech has gained prominence across various domains due to its wide-ranging applications. This paper presents a comprehensive review of advancements in emotion recognition, focusing on dialect speech, through the utilization of feature optimization and classification techniques. Dialectal variations in speech introduce complexities that impact the accuracy of emotion recognition models. To address this challenge, diverse feature extraction methods have been explored, capturing both general and dialect-specific acoustic cues. Spectral, prosodic, and temporal features are adapted and optimized to enhance emotional content representation within dialect speech. Classification techniques play a pivotal role in distinguishing emotions in dialect speech. Traditional classifiers like Support Vector Machines (SVMs), Gaussian Mixture Models (GMMs), and Hidden Markov Models (HMMs) have been employed. Recent studies highlight the efficacy of machine learning approaches such as Random Forests, Gradient Boosting, Convolutional Neural Networks (CNNs), and Long Short-Term Memory networks (LSTMs). Feature selection and dimensionality reduction techniques optimize model performance. Principal Component Analysis (PCA), Recursive Feature Elimination (RFE), and genetic algorithms enhance feature sets, improving classification accuracy and computational efficiency. Datasets tailored for dialect-specific speech corpora address linguistic nuances and contribute to the model's relevance to distinct regions or communities. Challenges include limited labelled dialect emotion datasets, model generalization across multiple dialects, and ethical considerations. As the field evolves, striking a balance between performance and ethics remains imperative. This review underscores the promise of optimized feature extraction, innovative classification techniques, and tailored datasets in dialect-based emotion recognition.
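The feature selection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the method of any reviewed paper. It assumes synthetic feature vectors standing in for prosodic and spectral measurements, two emotion classes, and a simplified backward elimination that repeatedly drops the feature with the weakest class separation. Full RFE, as named in the abstract, would instead refit a classifier on each round and rank features by its weights. All function names here (`make_data`, `separation`, `eliminate`, `centroid_accuracy`) are hypothetical.

```python
import random
import statistics


def make_data(n_per_class=200, seed=0):
    """Synthetic stand-in for speech features: columns 0 and 2 carry
    class information, columns 1 and 3 are pure noise (assumption)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, shift in ((0, 0.0), (1, 2.0)):
        for _ in range(n_per_class):
            X.append([
                rng.gauss(shift, 1.0),   # informative
                rng.gauss(0.0, 1.0),     # noise
                rng.gauss(-shift, 1.0),  # informative
                rng.gauss(0.0, 1.0),     # noise
            ])
            y.append(label)
    return X, y


def separation(X, y, j):
    """Absolute difference of class means for feature j, scaled by the
    average within-class standard deviation."""
    a = [row[j] for row, lab in zip(X, y) if lab == 0]
    b = [row[j] for row, lab in zip(X, y) if lab == 1]
    spread = (statistics.pstdev(a) + statistics.pstdev(b)) / 2 or 1e-12
    return abs(statistics.fmean(a) - statistics.fmean(b)) / spread


def eliminate(X, y, keep):
    """Drop the weakest remaining feature until `keep` features are left."""
    active = list(range(len(X[0])))
    while len(active) > keep:
        weakest = min(active, key=lambda j: separation(X, y, j))
        active.remove(weakest)
    return active


def centroid_accuracy(X, y, features):
    """Training accuracy of a nearest-centroid classifier restricted to
    the given feature indices."""
    centroids = {}
    for lab in sorted(set(y)):
        rows = [r for r, l in zip(X, y) if l == lab]
        centroids[lab] = [statistics.fmean([r[j] for r in rows]) for j in features]

    def dist(row, centre):
        return sum((row[j] - m) ** 2 for j, m in zip(features, centre))

    hits = sum(
        min(centroids, key=lambda c: dist(r, centroids[c])) == lab
        for r, lab in zip(X, y)
    )
    return hits / len(y)


if __name__ == "__main__":
    X, y = make_data()
    selected = eliminate(X, y, keep=2)
    print("selected features:", selected)
    print("accuracy on selected:", centroid_accuracy(X, y, selected))
```

The nearest-centroid classifier is used only to keep the sketch dependency-free; in practice an SVM or another classifier from the abstract would take its place, and accuracy would be measured on held-out data rather than the training set.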
Pages: 73793 / 73793
Number of pages: 34
Related papers
50 in total
  • [21] Feature Analysis for Speech Emotion Classification
    Kingeski, R.
    Schueda, L. A. P.
    Paterno, A. S.
    XXVII BRAZILIAN CONGRESS ON BIOMEDICAL ENGINEERING, CBEB 2020, 2022, : 2359 - 2365
  • [22] A novel feature selection method for speech emotion recognition
    Ozseven, Turgut
    APPLIED ACOUSTICS, 2019, 146 : 320 - 326
  • [23] KOLMOGOROV-SMIRNOV TEST FOR FEATURE SELECTION IN EMOTION RECOGNITION FROM SPEECH
    Ivanov, Alexei
    Riccardi, Giuseppe
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 5125 - 5128
  • [24] A robust feature selection method based on meta-heuristic optimization for speech emotion recognition
    Kesava Rao Bagadi
    Chandra Mohan Reddy Sivappagari
    Evolutionary Intelligence, 2024, 17 : 993 - 1004
  • [25] A robust feature selection method based on meta-heuristic optimization for speech emotion recognition
    Bagadi, Kesava Rao
    Sivappagari, Chandra Mohan Reddy
    EVOLUTIONARY INTELLIGENCE, 2024, 17 (02) : 993 - 1004
  • [26] Combined Feature Representation for Emotion Classification from Russian Speech
    Verkholyak, Oxana
    Karpov, Alexey
    ARTIFICIAL INTELLIGENCE AND NATURAL LANGUAGE, 2018, 789 : 68 - 73
  • [27] Emotion Recognition from Speech Signals using Excitation Source and Spectral Features
    Choudhury, Akash Roy
    Ghosh, Anik
    Pandey, Rahul
    Barman, Subhas
    PROCEEDINGS OF 2018 IEEE APPLIED SIGNAL PROCESSING CONFERENCE (ASPCON), 2018, : 257 - 261
  • [28] Speech Emotion Recognition Using Sequential Capsule Networks
    Wu, Xixin
    Cao, Yuewen
    Lu, Hui
    Liu, Songxiang
    Wang, Disong
    Wu, Zhiyong
    Liu, Xunying
    Meng, Helen
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2021, 29 : 3280 - 3291
  • [29] Emotion Recognition in Speech using Multi-Classification SVM
    Zhang, Weishan
    Meng, Xin
    Li, Zhongwei
    Lu, Qinghua
    Tan, Shaochao
    IEEE 12TH INT CONF UBIQUITOUS INTELLIGENCE & COMP/IEEE 12TH INT CONF ADV & TRUSTED COMP/IEEE 15TH INT CONF SCALABLE COMP & COMMUN/IEEE INT CONF CLOUD & BIG DATA COMP/IEEE INT CONF INTERNET PEOPLE AND ASSOCIATED SYMPOSIA/WORKSHOPS, 2015, : 1181 - 1186
  • [30] Speech emotion recognition of Hindi speech using statistical and machine learning techniques
    Agrawal, Akshat
    Jain, Anurag
    JOURNAL OF INTERDISCIPLINARY MATHEMATICS, 2020, 23 (01) : 311 - 319