Cross-Corpus Speech Emotion Recognition Based on Deep Domain-Adaptive Convolutional Neural Network

Cited by: 18

Authors:

Liu, Jiateng [1]

Zheng, Wenming [1]

Zong, Yuan [1]

Lu, Cheng [2]

Tang, Chuangao [1]

Affiliations:

[1] Southeast Univ, Minist Educ, Key Lab Child Dev & Learning Sci, Nanjing 210096, Peoples R China

[2] Southeast Univ, Sch Informat Sci & Engn, Nanjing 210096, Peoples R China

Source:

IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS | 2020 / Vol. E103D / No. 02

Funding:

National Natural Science Foundation of China;

Keywords:

cross-corpus speech emotion recognition; deep convolutional neural network; domain adaptation;

DOI:

10.1587/transinf.2019EDL8136

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812;

Abstract:

In this letter, we propose a novel deep domain-adaptive convolutional neural network (DDACNN) model to handle the challenging cross-corpus speech emotion recognition (SER) problem. The DDACNN framework consists of two components: a feature extraction model based on a deep convolutional neural network (DCNN), and a domain-adaptive (DA) layer added to the DCNN that uses the maximum mean discrepancy (MMD) criterion. We feed labeled spectrograms from the source speech corpus, together with unlabeled spectrograms from the target speech corpus, into two classic DCNNs to extract the emotional features of speech, and train the model with a mixed loss that combines a cross-entropy loss and an MMD loss. Compared with other classic cross-corpus SER methods, the major advantage of the DDACNN model is that it extracts robust, time-frequency-related speech features from spectrograms and narrows the discrepancy between the feature distributions of the source and target corpora, which yields better cross-corpus performance. In several cross-corpus SER experiments, DDACNN achieved state-of-the-art performance on three public emotional speech corpora and proved able to handle the cross-corpus SER problem efficiently.
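The mixed loss described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, its bandwidth `sigma`, the biased MMD estimator, the weighting factor `lam`, and all function names are assumptions chosen for illustration, and plain Python lists stand in for the features produced by the DA layer.

```python
import math

def gaussian_kernel(x, y, sigma=1.0):
    """RBF kernel between two feature vectors (assumed kernel choice)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def _mean_kernel(xs, ys, sigma):
    """Average kernel value over all pairs drawn from two batches."""
    total = sum(gaussian_kernel(x, y, sigma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))

def mmd_squared(source, target, sigma=1.0):
    """Biased estimate of the squared MMD between two feature batches."""
    return (_mean_kernel(source, source, sigma)
            + _mean_kernel(target, target, sigma)
            - 2.0 * _mean_kernel(source, target, sigma))

def cross_entropy(logits, label):
    """Softmax cross-entropy for one labeled sample (log-sum-exp form)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def mixed_loss(src_logits, src_labels, src_feats, tgt_feats,
               lam=1.0, sigma=1.0):
    """Cross-entropy on labeled source samples plus weighted MMD.

    lam is a hypothetical trade-off weight; the abstract does not
    state how the two terms are balanced.
    """
    ce = sum(cross_entropy(l, y)
             for l, y in zip(src_logits, src_labels)) / len(src_labels)
    return ce + lam * mmd_squared(src_feats, tgt_feats, sigma)
```

Only the source batch contributes to the classification term, since the target spectrograms are unlabeled; the target batch enters solely through the MMD term, which shrinks as the two feature distributions move closer together.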


Pages: 459-463

Page count: 5

