Cross-Corpus Speech Emotion Recognition Based on Deep Domain-Adaptive Convolutional Neural Network

Cited by: 18
Authors
Liu, Jiateng [1]
Zheng, Wenming [1]
Zong, Yuan [1]
Lu, Cheng [2]
Tang, Chuangao [1]
Affiliations
[1] Southeast Univ, Minist Educ, Key Lab Child Dev & Learning Sci, Nanjing 210096, Peoples R China
[2] Southeast Univ, Sch Informat Sci & Engn, Nanjing 210096, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
cross-corpus speech emotion recognition; deep convolutional neural network; domain adaptation;
DOI
10.1587/transinf.2019EDL8136
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
In this letter, we propose a novel deep domain-adaptive convolutional neural network (DDACNN) model to handle the challenging cross-corpus speech emotion recognition (SER) problem. The DDACNN framework consists of two components: a feature extraction model based on a deep convolutional neural network (DCNN), and a domain-adaptive (DA) layer added to the DCNN that uses the maximum mean discrepancy (MMD) criterion. We feed labeled spectrograms from the source speech corpus, together with unlabeled spectrograms from the target speech corpus, into two classic DCNNs to extract emotional features of speech, and we train the model with a mixed loss that combines a cross-entropy loss with an MMD loss. Compared to other classic cross-corpus SER methods, the major advantage of the DDACNN model is twofold: it extracts robust, time-frequency-related speech features from spectrograms, and it narrows the discrepancy between the feature distributions of the source and target corpora, which improves cross-corpus performance. In several cross-corpus SER experiments, DDACNN achieves state-of-the-art performance on three public emotional speech corpora and is shown to handle the cross-corpus SER problem efficiently.
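The mixed loss described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the kernel, its bandwidth, or how the two terms are weighted, so the Gaussian kernel, the `bandwidth` parameter, the `mmd_weight` coefficient, and all function names below are assumptions chosen for illustration. The cross-entropy term uses only labeled source samples, while the MMD term compares source and target features and needs no target labels.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(a, b, bandwidth):
    """Average kernel value over all pairs with one vector from a and one from b."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in a for y in b)
    return total / (len(a) * len(b))


def mmd_squared(source_feats, target_feats, bandwidth=1.0):
    """Biased estimate of the squared MMD between two batches of features."""
    return (
        mean_kernel(source_feats, source_feats, bandwidth)
        + mean_kernel(target_feats, target_feats, bandwidth)
        - 2.0 * mean_kernel(source_feats, target_feats, bandwidth)
    )


def cross_entropy(logits, labels):
    """Mean softmax cross-entropy over a batch of labeled source samples."""
    total = 0.0
    for row, label in zip(logits, labels):
        peak = max(row)  # subtract the max before exponentiating, for stability
        log_norm = peak + math.log(sum(math.exp(v - peak) for v in row))
        total += log_norm - row[label]
    return total / len(labels)


def mixed_loss(source_logits, source_labels, source_feats, target_feats,
               mmd_weight=1.0, bandwidth=1.0):
    """Classification loss on the source corpus plus a weighted MMD penalty.

    mmd_weight is a hypothetical trade-off coefficient, not a value from the paper.
    """
    classification = cross_entropy(source_logits, source_labels)
    discrepancy = mmd_squared(source_feats, target_feats, bandwidth)
    return classification + mmd_weight * discrepancy
```

In this sketch, identical source and target feature batches give an MMD term of zero, so the mixed loss reduces to the cross-entropy term; the further apart the two feature distributions are, the larger the penalty becomes.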
Pages: 459-463
Page count: 5
Related papers
50 in total
  • [1] Cross-Corpus Speech Emotion Recognition Based on Domain-Adaptive Least-Squares Regression
    Zong, Yuan
    Zheng, Wenming
    Zhang, Tong
    Huang, Xiaohua
    IEEE SIGNAL PROCESSING LETTERS, 2016, 23 (05) : 585 - 589
  • [2] UNSUPERVISED CROSS-CORPUS SPEECH EMOTION RECOGNITION USING DOMAIN-ADAPTIVE SUBSPACE LEARNING
    Liu, Na
    Zong, Yuan
    Zhang, Baofeng
    Liu, Li
    Chen, Jie
    Zhao, Guoying
    Zhu, Junchao
    2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2018, : 5144 - 5148
  • [3] DOMAIN GENERALIZATION WITH TRIPLET NETWORK FOR CROSS-CORPUS SPEECH EMOTION RECOGNITION
    Lee, Shi-wook
    2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021, : 389 - 396
  • [4] Domain Generalization with Triplet Network for Cross-Corpus Speech Emotion Recognition
    Lee, Shi-Wook
    2021 IEEE Spoken Language Technology Workshop, SLT 2021 - Proceedings, 2021, : 389 - 396
  • [5] Convolutional neural network-based cross-corpus speech emotion recognition with data augmentation and features fusion
    Jahangir, Rashid
    Teh, Ying Wah
    Mujtaba, Ghulam
    Alroobaea, Roobaea
    Shaikh, Zahid Hussain
    Ali, Ihsan
    MACHINE VISION AND APPLICATIONS, 2022, 33 (03)
  • [6] Convolutional neural network-based cross-corpus speech emotion recognition with data augmentation and features fusion
    Rashid Jahangir
    Ying Wah Teh
    Ghulam Mujtaba
    Roobaea Alroobaea
    Zahid Hussain Shaikh
    Ihsan Ali
    Machine Vision and Applications, 2022, 33
  • [7] Cross-Corpus Speech Emotion Recognition Based on Hybrid Neural Networks
    Rehman, Abdul
    Liu, Zhen-Tao
    Li, Dan-Yun
    Wu, Bao-Han
    PROCEEDINGS OF THE 39TH CHINESE CONTROL CONFERENCE, 2020, : 7464 - 7468
  • [8] Deep Transductive Transfer Regression Network for Cross-Corpus Speech Emotion Recognition
    Zhao, Yan
    Wang, Jincen
    Ye, Ru
    Zong, Yuan
    Zheng, Wenming
    Zhao, Li
    INTERSPEECH 2022, 2022, : 371 - 375
  • [9] Improved Cross-Corpus Speech Emotion Recognition Using Deep Local Domain Adaptation
    ZHAO Huijuan
    YE Ning
    WANG Ruchuan
    Chinese Journal of Electronics, 2023, 32 (03) : 640 - 646
  • [10] Convolutional Auto-Encoder and Adversarial Domain Adaptation for Cross-Corpus Speech Emotion Recognition
    Wang, Yang
    Fu, Hongliang
    Tao, Huawei
    Yang, Jing
    Ge, Hongyi
    Xie, Yue
    IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2022, E105D (10) : 1803 - 1806