Deep ensemble models for speech emotion classification

Cited by: 4
Authors
Pravin, Sheena Christabel [1 ]
Sivaraman, Vishal Balaji [2 ]
Saranya, J. [3 ]
Affiliations
[1] Vellore Inst Technol, Sch Elect Engn SENSE, Chennai, India
[2] Univ Florida, Gainesville, FL USA
[3] Rajalakshmi Engn Coll, Thandalam, India
Keywords
Deep cascaded ensemble; Deep parallel ensemble; Speech emotion classification; Memory consumption and run time complexity; Recognition
DOI
10.1016/j.micpro.2023.104790
CLC Classification
TP3 [Computing Technology; Computer Technology]
Subject Classification Code
0812
Abstract
This research article proposes two deep ensemble models, the Deep Cascaded Ensemble (DCE) and the Deep Parallel Ensemble (DPE), for automatic speech emotion classification. Classification of emotions into their respective classes has long relied on machine learning and deep learning networks. The proposed models blend different machine learning and deep learning models in a cascaded or parallel architecture, and exhibited a considerable reduction in memory consumption in a Google Colab environment. Furthermore, the proposed deep ensemble models overcome the need to tune numerous hyper-parameters and the large data demands of deep learning algorithms. The proposed DCE and DPE yielded optimal classification accuracy with reduced memory consumption on less data. Experimentally, the proposed deep cascaded ensemble also outperforms the deep parallel ensemble built from the same combination of deep learning and machine learning networks. The proposed models and the baseline models were evaluated on performance metrics including Cohen's kappa coefficient, classification accuracy, and space and time complexity.
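The abstract distinguishes the two architectures only at a high level, so the following is a minimal, hypothetical sketch of the cascaded-versus-parallel ensemble idea: in a cascade, one model's output feeds the next model; in a parallel ensemble, the models see the same input and their outputs are fused. The toy "models" below are placeholder functions, not the networks used in the paper.

```python
# Hypothetical sketch of cascaded vs. parallel ensembling; the two "models"
# are toy stand-ins for the paper's deep/machine learning components.

def model_a(features):
    # Toy stand-in for a first-stage learner: returns per-class scores.
    return [f * 0.5 for f in features]

def model_b(features):
    # Toy stand-in for a second learner.
    return [f + 1.0 for f in features]

def cascaded_ensemble(features):
    # Cascade (DCE idea): the first model's output is the second's input.
    return model_b(model_a(features))

def parallel_ensemble(features):
    # Parallel (DPE idea): both models see the raw input; fuse by averaging.
    out_a, out_b = model_a(features), model_b(features)
    return [(a + b) / 2 for a, b in zip(out_a, out_b)]

def predict(scores):
    # Predicted emotion class = index of the highest fused score.
    return max(range(len(scores)), key=scores.__getitem__)

x = [0.2, 1.4, 0.6]  # pretend acoustic features for one utterance
print(predict(cascaded_ensemble(x)), predict(parallel_ensemble(x)))
```

In a real system each stand-in would be a trained classifier (e.g. a neural network or an SVM), and fusion in the parallel case could be probability averaging or majority voting.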
Pages: 9