Empirical Comparison between Deep and Classical Classifiers for Speaker Verification in Emotional Talking Environments

Cited by: 2
Authors
Nassif, Ali Bou [1]
Shahin, Ismail [2]
Lataifeh, Mohammed [3]
Elnagar, Ashraf [3]
Nemmour, Nawel [1]
Affiliations
[1] Univ Sharjah, Comp Engn Dept, Sharjah 27272, U Arab Emirates
[2] Univ Sharjah, Elect Engn Dept, Sharjah 27272, U Arab Emirates
[3] Univ Sharjah, Comp Sci Dept, Sharjah 27272, U Arab Emirates
Keywords
classical classifiers; deep neural network; emotional speech; feature extraction; speaker verification
DOI
10.3390/info13100456
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Speech signals carry various pieces of information about the speaker, such as age, gender, accent, language, health, and emotions. Emotions are conveyed through modulations of facial and vocal expressions. This paper empirically compares the performance of classical classifiers, namely Gaussian Mixture Model (GMM), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Artificial Neural Network (ANN), with that of deep learning classifiers, namely Long Short-Term Memory (LSTM), Convolutional Neural Network (CNN), and Gated Recurrent Unit (GRU), in addition to the i-vector approach, on a text-independent speaker verification task in neutral and emotional talking environments. The deep models undergo hyperparameter tuning using the Grid Search optimization algorithm. The models are trained and tested on a private Arabic Emirati Speech Database, the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), and the public Crowd-Sourced Emotional Multimodal Actors (CREMA) database. Performance is evaluated using the Equal Error Rate (EER) and Area Under the Curve (AUC) scores. The experimental results show that deep architectures do not necessarily outperform classical classifiers. Among the classical classifiers, the GMM yields the lowest EER values and the best AUC scores across all datasets. The i-vector model surpasses all the fine-tuned deep models (CNN, LSTM, and GRU) on both evaluation metrics in neutral as well as emotional speech. In addition, the GMM outperforms the i-vector on the Emirati and RAVDESS databases.
Pages: 23