Cross-Corpus Speech Emotion Recognition Based on Causal Emotion Information Representation

Cited by: 0
Authors
Fu, Hongliang [1]
Li, Qianqian [1]
Tao, Huawei [1]
Zhu, Chunhua [1]
Xie, Yue [2]
Guo, Ruxue [3]
Affiliations
[1] Henan Univ Technol, Key Lab Grain Informat Proc & Control, Minist Educ, Zhengzhou 450001, Peoples R China
[2] Nanjing Inst Technol, Sch Commun Engn, Nanjing 211167, Peoples R China
[3] IFLYTEK Res, Hefei 230088, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
cross-corpus speech emotion recognition; causal representation learning; domain adaptation
DOI
10.1587/transinf.2023EDL8087
Chinese Library Classification number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
Speech emotion recognition (SER) is a key research technology for realizing the third generation of artificial intelligence and is widely used in human-computer interaction, emotion diagnosis, interpersonal communication, and other fields. However, the mixing of language and semantic information in speech tends to distort the alignment of emotion features, which degrades the performance of cross-corpus SER systems. This paper proposes a cross-corpus SER model based on causal emotion information representation (CEIR). The model first uses the reconstruction loss of a deep autoencoder network together with source-domain label information to achieve a preliminary separation of causal features. It then constructs a causal correlation matrix and combines it with local maximum mean discrepancy (LMMD) feature alignment so that the causal features of different dimensions become independent in their joint distribution. Finally, supervised fine-tuning on labeled data is used to extract causal emotion information effectively. Experimental results show that the average unweighted average recall (UAR) of the proposed algorithm is 3.4% to 7.01% higher than that of several recent algorithms in the field.
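The abstract reports results as unweighted average recall (UAR). As a reminder of what that metric measures, here is a minimal sketch in plain Python. The function name and the toy labels are hypothetical illustrations, not taken from the paper, and the paper's actual evaluation code is not shown in this record.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally
    regardless of how many samples it has."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Hypothetical toy labels (not data from the paper).
y_true = ["angry", "angry", "angry", "angry", "happy", "sad"]
y_pred = ["angry", "angry", "angry", "happy", "happy", "angry"]

# Per-class recall: angry 3/4, happy 1/1, sad 0/1.
uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

In this toy case the accuracy is 4/6 while the UAR is (0.75 + 1 + 0) / 3, which illustrates why UAR is commonly preferred when emotion classes are imbalanced: a missed minority class lowers it as much as a missed majority class.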
Pages: 1097-1100
Number of pages: 4