Speaker to Emotion: Domain Adaptation for Speech Emotion Recognition with Residual Adapters

Cited by: 0
Authors
Xi, Yuxuan [1 ]
Li, Pengcheng [1 ]
Song, Yan [1 ]
Jiang, Yiheng [1 ]
Dai, Lirong [1 ]
Affiliations
[1] Univ Sci & Technol China, Natl Engn Lab Speech & Language Informat Proc, Hefei, Peoples R China
Funding
National Natural Science Foundation of China
DOI
10.1109/apsipaasc47483.2019.9023339
Chinese Library Classification (CLC) number
TP31 [Computer Software]
Discipline classification code
081202; 0835
Abstract
Despite considerable recent progress in deep learning methods for speech emotion recognition (SER), performance is severely restricted by the lack of large-scale labeled speech emotion corpora. For instance, it is difficult to employ complex neural network architectures such as ResNet, which, accompanied by large-scale corpora like VoxCeleb and NIST SRE, have proven to perform well for the related speaker verification (SV) task. In this paper, a novel domain adaptation method is proposed for the SER task, which aims to transfer related information from a speaker corpus to an emotion corpus. Specifically, a residual adapter architecture is designed for the SER task, where ResNet acts as a universal model for general information extraction. An adapter module then trains a limited number of additional parameters to focus on modeling the deviation for the specific SER task. To evaluate the effectiveness of the proposed method, we conduct extensive evaluations on the benchmark IEMOCAP and CHEAVD 2.0 corpora. Results show significant improvement, with overall results in each task outperforming or matching state-of-the-art methods.
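The adapter idea in the abstract can be illustrated with a minimal sketch: a frozen base layer (standing in for a ResNet layer pretrained on speaker data) plus a small trainable module added on a residual path. This is not the authors' implementation. The class name `AdaptedBlock`, the bottleneck shape of the adapter, the zero initialisation, and the dimensions are all assumptions chosen for illustration, and dense layers are used in place of convolutions to keep the example dependency-free.

```python
import random


def matvec(weights, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def relu(x):
    return [max(0.0, v) for v in x]


class AdaptedBlock:
    """Hypothetical block: frozen base layer plus a small residual adapter."""

    def __init__(self, dim, bottleneck, seed=0):
        rng = random.Random(seed)
        # Frozen weights: stand-in for a layer pretrained on a speaker corpus.
        self.base = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(dim)]
        # Trainable adapter (assumed shape): project down, then back up.
        self.down = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(bottleneck)]
        # Zero-initialised up-projection, so the adapter starts as a no-op.
        self.up = [[0.0] * bottleneck for _ in range(dim)]

    def base_forward(self, x):
        return relu(matvec(self.base, x))

    def forward(self, x):
        h = self.base_forward(x)
        # The adapter models a task-specific deviation added to the base output.
        delta = matvec(self.up, relu(matvec(self.down, h)))
        return [a + b for a, b in zip(h, delta)]

    def frozen_params(self):
        return len(self.base) * len(self.base[0])

    def trainable_params(self):
        return (len(self.down) * len(self.down[0])
                + len(self.up) * len(self.up[0]))


block = AdaptedBlock(dim=64, bottleneck=4)
x = [1.0] * 64
print(block.frozen_params(), block.trainable_params())
```

In this sketch only `down` and `up` would be updated on the emotion corpus, so the number of parameters fitted to the small labeled set stays a fraction of the frozen base.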
Pages: 513-518
Number of pages: 6
Related papers
50 in total
  • [11] Universum Autoencoder-Based Domain Adaptation for Speech Emotion Recognition
    Deng, Jun
    Xu, Xinzhou
    Zhang, Zixing
    Fruhholz, Sascha
    Schuller, Bjorn
    IEEE SIGNAL PROCESSING LETTERS, 2017, 24 (04) : 500 - 504
  • [12] COUPLED UNSUPERVISED DEEP CONVOLUTIONAL DOMAIN ADAPTATION FOR SPEECH EMOTION RECOGNITION
    Noi, Ocquaye Elias Nii
    Mao, Qirong
    Xu, Guopeng
    Xue, Yanfei
    2018 IEEE FOURTH INTERNATIONAL CONFERENCE ON MULTIMEDIA BIG DATA (BIGMM), 2018
  • [13] Autoencoder-based Unsupervised Domain Adaptation for Speech Emotion Recognition
    Deng, Jun
    Zhang, Zixing
    Eyben, Florian
    Schuller, Bjoern
    IEEE SIGNAL PROCESSING LETTERS, 2014, 21 (09) : 1068 - 1072
  • [14] Integrating Emotion Recognition with Speech Recognition and Speaker Diarisation for Conversations
    Wu, Wen
    Zhang, Chao
    Woodland, Philip C.
    INTERSPEECH 2023, 2023, : 3607 - 3611
  • [15] Emotion Attribute Projection for Speaker Recognition on Emotional Speech
    Bao, Huanjun
    Xu, Mingxing
    Zheng, Thomas Fang
    INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4, 2007, : 601 - 604
  • [16] Speaker independent speech emotion recognition by ensemble classification
    Schuller, B
    Reiter, S
    Müller, R
    Al-Hames, M
    Lang, M
    Rigoll, G
    2005 IEEE INTERNATIONAL CONFERENCE ON MULTIMEDIA AND EXPO (ICME), VOLS 1 AND 2, 2005, : 865 - 868
  • [17] Compensating for speaker or lexical variabilities in speech for emotion recognition
    Mariooryad, Soroosh
    Busso, Carlos
    SPEECH COMMUNICATION, 2014, 57 : 1 - 12
  • [18] AUTOMATED SPEECH RECOGNITION SYSTEM FOR SPEAKER EMOTION CLASSIFICATION
    Anithadevi, N.
    Gokul, P.
    Nandan, S. Muhil
    Magesh, R.
    Shiddharth, S.
    PROCEEDINGS OF THE 2020 5TH INTERNATIONAL CONFERENCE ON COMPUTING, COMMUNICATION AND SECURITY (ICCCS-2020), 2020
  • [19] SPEAKER VARIABILITY IN EMOTION RECOGNITION - AN ADAPTATION BASED APPROACH
    Ding, Ni
    Sethu, Vidhyasaharan
    Epps, Julien
    Ambikairajah, Eliathamby
    2012 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2012, : 5101 - 5104
  • [20] Unsupervised Adversarial Domain Adaptation for Cross-Lingual Speech Emotion Recognition
    Latif, Siddique
    Qadir, Junaid
    Bilal, Muhammad
    2019 8TH INTERNATIONAL CONFERENCE ON AFFECTIVE COMPUTING AND INTELLIGENT INTERACTION (ACII), 2019