Deep Learning Framework for Speech Emotion Classification: A Survey of the State-of-the-Art

Cited by: 1
Authors
Akinpelu, Samson [1]
Viriri, Serestina [1]
Affiliations
[1] Univ KwaZulu Natal, Sch Math Stat & Comp Sci, ZA-4041 Durban, South Africa
Source
IEEE ACCESS | 2024 / Vol. 12
Keywords
Deep learning; Feature extraction; Convolutional neural networks; Accuracy; Surveys; Reviews; Neurons; Hidden Markov models; Computer architecture; Emotion recognition; Human computer interaction; Speech recognition; Human-computer interaction; deep learning; speech emotion recognition; convolutional neural networks; vision transformer; mel spectrogram; RECOGNITION;
DOI
10.1109/ACCESS.2024.3474553
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
The intricate landscape of speech emotion classification poses a captivating yet challenging realm due to emotions being fundamental to human communication. In recent years, deep learning frameworks have emerged as powerful tools, shedding light on the elusive domain of emotion recognition, revolutionizing human-computer interactions, and enhancing the emotional intelligence of artificial intelligence (AI). This survey embarks on an exploratory journey into the forefront of deep learning approaches dedicated to speech emotion classification. Deep learning has become the standard approach due to the availability of extensive speech corpora and the need for high accuracy at low computational cost. The reason lies in its potency to extract important emotional features from large or medium-sized spectrogram images. Deep learning has been applied to speech emotion classification by many researchers, leading to significant improvements in performance and accuracy. Modern deep learning methods designed for human auditory speech emotion classification are carefully examined in this work. A thorough examination of various deep learning framework designs used in emotion classification is provided, illuminating unique characteristics that capture essential features from speech signals for accurate emotion prediction. The research critically analyzes selected deep models using well-established emotion corpora, highlighting their effectiveness. This research analyzes typical performance evaluation metrics used to evaluate speech emotion classification models. With this review, we hope to offer a comprehensive overview of the state-of-the-art, potential directions for further investigation, and developing approaches that further the field of speech emotion classification with deep learning frameworks.
Pages: 152152 - 152182
Page count: 31
Related papers
50 in total
  • [41] A Comprehensive Evaluation of State-of-the-Art Deep Learning Models for Road Surface Type Classification
    Hnoohom, Narit
    Mekruksavanich, Sakorn
    Jitpattanakul, Anuchit
    INTELLIGENT AUTOMATION AND SOFT COMPUTING, 2023, 37 (02) : 1275 - 1291
  • [42] Enhancing multimodal disaster tweet classification using state-of-the-art deep learning networks
    Divakaran Adwaith
    Ashok Kumar Abishake
    Siva Venkatesh Raghul
    Elango Sivasankar
    Multimedia Tools and Applications, 2022, 81 : 18483 - 18501
  • [43] Palm Oil Counter: State-of-the-Art Deep Learning Models for Detection and Counting in Plantations
    Naftali, Martinus Grady
    Hugo, Gregory
    Suharjito
    IEEE ACCESS, 2024, 12 : 90395 - 90417
  • [44] Enhancing multimodal disaster tweet classification using state-of-the-art deep learning networks
    Adwaith, Divakaran
    Abishake, Ashok Kumar
    Raghul, Siva Venkatesh
    Sivasankar, Elango
    MULTIMEDIA TOOLS AND APPLICATIONS, 2022, 81 (13) : 18483 - 18501
  • [45] DL4SciVis: A State-of-the-Art Survey on Deep Learning for Scientific Visualization
    Wang, Chaoli
    Han, Jun
    IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS, 2023, 29 (08) : 3714 - 3733
  • [46] XcelNet14: A Novel Deep Learning Framework for Aerial Scene Classification
    Ahmed, Bilal
    Akram, Tallha
    Naqvi, Syed Rameez
    Alsuhaibani, Anas
    Attique Khan, Muhammad
    Kraiem, Naoufel
    IEEE ACCESS, 2024, 12 : 196266 - 196281
  • [47] A Context-Supported Deep Learning Framework for Multimodal Brain Imaging Classification
    Jiang, Jianmin
    Fares, Ahmed
    Zhong, Sheng-Hua
    IEEE TRANSACTIONS ON HUMAN-MACHINE SYSTEMS, 2019, 49 (06) : 611 - 622
  • [48] A Deep Learning Framework for Malware Classification
    Kalash, Mahmoud
    Rochan, Mrigank
    Mohammed, Noman
    Bruce, Neil
    Wang, Yang
    Iqbal, Farkhund
    INTERNATIONAL JOURNAL OF DIGITAL CRIME AND FORENSICS, 2020, 12 (01) : 90 - 108
  • [49] TOWARD ROBUST SPEECH EMOTION RECOGNITION AND CLASSIFICATION USING NATURAL LANGUAGE PROCESSING WITH DEEP LEARNING MODEL
    Alahmari, Saad
    Al-Shathry, Najla I.
    Eltahir, Majdy M.
    Alzaidi, Muhammad Swaileh A.
    Alghamdi, Ayman Ahmad
    Mahmud, Ahmed
    FRACTALS-COMPLEX GEOMETRY PATTERNS AND SCALING IN NATURE AND SOCIETY, 2025,
  • [50] Speech Emotion Detection using IoT based Deep Learning for Health Care
    Tariq, Zeenat
    Shah, Sayed Khushal
    Lee, Yugyung
    2019 IEEE INTERNATIONAL CONFERENCE ON BIG DATA (BIG DATA), 2019 : 4191 - 4196