Analysis of Speaker Recognition in Blended Emotional Environment Using Deep Learning Approaches

Cited by: 1
Authors
Tomar, Shalini [1]
Koolagudi, Shashidhar G. [1]
Affiliations
[1] Natl Inst Technol, Dept Comp Sci & Engn, Mangalore, Karnataka, India
Source
PATTERN RECOGNITION AND MACHINE INTELLIGENCE, PREMI 2023 | 2023 / Vol. 14301
Keywords
Blended emotion; Mel Frequency Cepstral Coefficients; Convolutional Neural Network; Speaker Recognition; Speaker Recognition in Blended Emotion Environment; Valence;
DOI
10.1007/978-3-031-45170-6_72
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Human conversation generally carries some emotion, and natural emotions are often blended. Today's Speaker Recognition systems do not account for emotion. This work proposes a Speaker Recognition in Blended Emotion Environment (SRBEE) system to enhance Speaker Recognition (SR) in an emotional context. SR algorithms nearly always achieve perfect performance on neutral speech, but this does not hold for emotional speech. This work attempts to recognize speakers in blended emotion using Mel-Frequency Cepstral Coefficients (MFCC) for feature extraction and a Conv2D classifier. In a blended emotional environment, calculating the accuracy of the Speaker Recognition task is complex. Utterances blending four basic natural emotions (happy, sad, angry, and fearful) are tested in the proposed system to reduce the complexity of SR in a blended emotional environment. The proposed system achieves an average accuracy of 99.3% for blended emotion with neutral speech and 92.8% for the four basic blended natural emotions (happy, sad, angry, and fearful). The dataset was prepared by blending two emotions in one utterance.
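As a rough illustration of the feature-extraction step named in the abstract, the following sketch computes MFCCs for a blended utterance using NumPy only. It is not the authors' implementation. The abstract does not say how two emotions are blended, so simple concatenation is assumed here, and the function names (`blend_utterances`, `mfcc`), the frame and filter settings, and the synthetic sine-wave "utterances" are all hypothetical choices made for the example.

```python
import numpy as np


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank


def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_coeffs=13):
    """Return an (n_frames, n_coeffs) MFCC matrix for a 1-D signal."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    energies = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_energies = np.log(energies + 1e-10)  # small offset avoids log(0)
    # Unnormalized DCT-II over the filter axis decorrelates the log energies.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n[None, :] + 1)
                   / (2 * n_filters))
    return log_energies @ basis.T


def blend_utterances(first, second):
    """ASSUMPTION: blending means concatenating two emotional segments."""
    return np.concatenate([first, second])


# Synthetic stand-ins for two emotional segments from one speaker (0.5 s each).
t = np.arange(8000) / 16000.0
segment_a = np.sin(2 * np.pi * 220.0 * t)
segment_b = 0.5 * np.sin(2 * np.pi * 110.0 * t)

blended = blend_utterances(segment_a, segment_b)
features = mfcc(blended)
print(features.shape)  # (98, 13)
```

A two-dimensional convolutional classifier would typically consume such a matrix as a single-channel image, for example by reshaping it to `(98, 13, 1)`; the network architecture itself is not described in the abstract and is left out here.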
Pages: 691-698
Page count: 8
Related papers
16 in total
  • [1] Ghiurcau M.V., 2011, PROC SIGNAL PROCESS
  • [2] Koolagudi S.G., 2011, 2011 INT C DEVICES C, P1, DOI 10.1109/ICDECOM.2011.5738540
  • [3] Koolagudi S. G., 2012, P CUBE INT INFORM TE
  • [4] Koolagudi SG, 2012, COMM COM INF SC, V305, P117
  • [5] Understanding mixed emotions: paradigms and measures
    Kreibig, Sylvia D.
    Gross, James J.
    [J]. CURRENT OPINION IN BEHAVIORAL SCIENCES, 2017, 15 : 62 - 71
  • [6] The Case for Mixed Emotions
    Larsen, Jeff T.
    McGraw, A. Peter
    [J]. SOCIAL AND PERSONALITY PSYCHOLOGY COMPASS, 2014, 8 (06): 263 - 274
  • [7] Exploring the distribution of statistical feature parameters for natural sound textures
    Mishra, Ambika P.
    Harper, Nicol S.
    Schnupp, Jan W. H.
    [J]. PLOS ONE, 2021, 16 (06):
  • [8] Nakagawa S, 2007, INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, P1065
  • [9] Emotional speaker identification using a novel capsule nets model
    Nassif, Ali Bou
    Shahin, Ismail
    Elnagar, Ashraf
    Velayudhan, Divya
    Alhudhaif, Adi
    Polat, Kemal
    [J]. EXPERT SYSTEMS WITH APPLICATIONS, 2022, 193
  • [10] Parthasarathy S, 2017, INT CONF AFFECT, P434, DOI 10.1109/ACII.2017.8273636