Machine learning techniques for speech emotion recognition using paralinguistic acoustic features

Cited by: 11
Authors
Jha T. [1]
Kavya R. [1]
Christopher J. [1]
Arunachalam V. [2]
Affiliations
[1] Department of Computer Science and Information Systems, BITS Pilani Hyderabad Campus, Telangana, Hyderabad
[2] Department of Civil Engineering, BITS Pilani Hyderabad Campus, Telangana, Hyderabad
Keywords
Affective computing; Emotion recognition; Multilayer perceptron; Paralinguistic acoustic features; Support vector machine
DOI
10.1007/s10772-022-09985-6
Abstract
Speech emotion recognition is one of the fastest growing areas of interest in the field of affective computing. Emotion detection aids human–computer interaction and finds application in a wide gamut of sectors, ranging from healthcare to retail to education. The present work strives to provide a speech emotion recognition framework that is both reliable and efficient enough to work in real-time environments. Speech emotion recognition can be performed using linguistic as well as paralinguistic aspects of speech; this work focusses on the latter, using non-lexical or paralinguistic attributes of speech like pitch, intensity and mel-frequency cepstral coefficients to train supervised machine learning models for emotion recognition. A combination of prosodic and spectral features is used for experimental analysis and classification is performed using algorithms like Gaussian Naïve Bayes, Random Forest, k-Nearest Neighbours, Support Vector Machine and Multilayer Perceptron. The choice of these ML models was based on the swiftness with which they could be trained, making them more suitable for real-time applications. Comparative analysis of the models reveals SVM and MLP to be the best performers with 77.86% and 79.62% accuracies respectively. The performance of these classifiers is compared with benchmark results in literature, and a significant improvement over state-of-the-art models is presented. The observations and findings of this work can be applied to design real-time emotion recognition frameworks that can be used to design and develop applications and technologies for various domains. © 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
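The abstract names pitch and intensity among the paralinguistic features used. As a rough illustration only, and not the authors' extraction pipeline, the sketch below computes both for a single frame using the Python standard library. The function names `rms_intensity` and `estimate_pitch`, the 75-400 Hz search range, and the synthetic sine-wave frame are all assumptions made for this example; pitch is estimated with a plain autocorrelation peak search, which is one common simple approach, and MFCCs are left out.

```python
import math


def rms_intensity(frame):
    """Root-mean-square amplitude of one frame, a simple proxy for intensity."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def estimate_pitch(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) from the autocorrelation peak.

    fmin and fmax are assumed bounds for this sketch, not values from the paper.
    """
    min_lag = int(sample_rate / fmax)  # shortest period considered
    max_lag = int(sample_rate / fmin)  # longest period considered
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


# Synthetic stand-in for a voiced speech frame: a 200 Hz sine, 50 ms at 8 kHz.
SAMPLE_RATE = 8000
frame = [0.5 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(400)]

features = {
    "intensity_rms": rms_intensity(frame),
    "pitch_hz": estimate_pitch(frame, SAMPLE_RATE),
}
print(features)
```

In a setup like the one the abstract describes, per-frame values of this kind would typically be summarized over an utterance and combined with spectral features before being passed to a classifier; how the authors actually do this is not stated here.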
Pages: 707-725
Page count: 18