Cross-Speaker Emotion Transfer Through Information Perturbation in Emotional Speech Synthesis

Cited by: 5

Authors:

Lei, Yi [1]

Yang, Shan [2]

Zhu, Xinfa [1]

Xie, Lei [1]

Su, Dan [2]

Affiliations:

[1] Northwestern Polytech Univ, Xian 710129, Peoples R China

[2] Tencent AI Lab, Beijing 100086, Peoples R China

Source:

IEEE SIGNAL PROCESSING LETTERS | 2022 / Vol. 29

Funding:

National Key Research and Development Program of China;

Keywords:

Timbre; Spectrogram; Perturbation methods; Generators; Speech synthesis; Adaptation models; Acoustics; Cross-speaker emotion transfer; emotional TTS; information perturbation; speech synthesis; RECOGNITION;

DOI:

10.1109/LSP.2022.3203888

Chinese Library Classification (CLC):

TM [Electrical Engineering]; TN [Electronics and Communication Technology];

Discipline classification codes:

0808 ; 0809 ;

Abstract:

Through borrowing emotional expressions from an emotional speaker, cross-speaker emotion transfer is an effective way to produce emotional speech for target speakers without emotional training data. Since emotion and timbre of the source speaker are heavily entangled in speech, existing approaches often struggle to trade off between speaker similarity and emotional expression in the synthetic speech of the target speaker. In this letter, we propose to disentangle timbre and emotion through information perturbation to conduct cross-speaker emotion transfer, which effectively learns the emotional expression of the source speaker and maintains the timbre of the target speaker. Specifically, we separately perturb the timbre and emotion-related features (e.g., formant and pitch) of source speech to obtain and model the timbre- and emotion-independent signals, based on which the proposed model can deliver the emotional expression for target speakers. Experimental results demonstrate the proposed approach significantly outperforms the baselines in terms of naturalness and similarity, indicating the effectiveness of information perturbation for cross-speaker emotion transfer.
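The abstract names pitch as one of the emotion-related features that is perturbed. As a rough illustration of what perturbing a pitch contour can look like, the sketch below randomly shifts and rescales an F0 track in the log domain. It is not the authors' implementation: the function name `perturb_pitch`, its parameters, the default ranges, and the convention that 0.0 marks an unvoiced frame are all assumptions made for this example.

```python
import math
import random


def perturb_pitch(f0_hz, max_shift_semitones=2.0, range_scale=(0.5, 1.5), rng=None):
    """Randomly shift and rescale an F0 contour (hypothetical helper).

    f0_hz is a list of per-frame F0 values in Hz; 0.0 marks an unvoiced frame
    (an assumed convention, not taken from the paper).
    """
    if rng is None:
        rng = random.Random()
    voiced = [f for f in f0_hz if f > 0.0]
    if not voiced:
        return list(f0_hz)  # nothing to perturb
    # Work in log2 so that the shift can be expressed in semitones.
    log_mean = sum(math.log2(f) for f in voiced) / len(voiced)
    shift = rng.uniform(-max_shift_semitones, max_shift_semitones) / 12.0
    scale = rng.uniform(*range_scale)
    perturbed = []
    for f in f0_hz:
        if f <= 0.0:
            perturbed.append(0.0)  # keep unvoiced frames unvoiced
        else:
            deviation = math.log2(f) - log_mean
            perturbed.append(2.0 ** (log_mean + shift + scale * deviation))
    return perturbed


contour = [0.0, 180.0, 200.0, 220.0, 0.0, 210.0]
perturbed = perturb_pitch(contour, rng=random.Random(0))
```

Because the scale factor is positive, the rise and fall of the contour keeps its direction while its absolute level and range change, which is the kind of distortion a pitch perturbation is meant to introduce. A formant perturbation would need a signal-level tool and is not sketched here.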

Citation

Pages: 1948-1952

Page count: 5

50 records in total

[21] CROSS-SPEAKER SILENT-SPEECH COMMAND WORD RECOGNITION USING ELECTRO-OPTICAL STOMATOGRAPHY
Stone, Simon
Birkholz, Peter
2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2020, : 7849 - 7853
[22] FINE-GRAINED EMOTION STRENGTH TRANSFER, CONTROL AND PREDICTION FOR EMOTIONAL SPEECH SYNTHESIS
Lei, Yi
Yang, Shan
Xie, Lei
2021 IEEE SPOKEN LANGUAGE TECHNOLOGY WORKSHOP (SLT), 2021, : 423 - 430
[23] MsEmoTTS: Multi-Scale Emotion Transfer, Prediction, and Control for Emotional Speech Synthesis
Lei, Yi
Yang, Shan
Wang, Xinsheng
Xie, Lei
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2022, 30 : 853 - 864
[24] Exploiting Emotion Information in Speaker Embeddings for Expressive Text-to-Speech
Shaheen, Zein
Sadekova, Tasnima
Matveeva, Yulia
Shirshova, Alexandra
Kudinov, Mikhail
INTERSPEECH 2023, 2023, : 2038 - 2042
[25] Speaker and gender dependencies in within/cross linguistic Speech Emotion Recognition
Chakhtouna A.
Sekkate S.
Adib A.
International Journal of Speech Technology, 2023, 26 (03) : 609 - 625
[26] A Method for Emotional Speech Synthesis Based on Speaker Adaptive Training
Lu, Xiaoyong
Li, Yanqin
Yang, Hongwu
2018 11TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2018, : 31 - 35
[27] A DNN-based emotional speech synthesis by speaker adaptation
Yang, Hongwu
Zhang, Weizhao
Zhi, Pengpeng
2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 633 - 637
[28] Estimating Mutual Information in Prosody Representation for Emotional Prosody Transfer in Speech Synthesis
Zhang, Guangyan
Qiu, Shirong
Qin, Ying
Lee, Tan
2021 12TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2021,
[29] Speaker Dependent, Speaker Independent and Cross Language Emotion Recognition From Speech Using GMM and HMM
Bhaykar, Manav
Yadav, Jainath
Rao, K. Sreenivasa
2013 NATIONAL CONFERENCE ON COMMUNICATIONS (NCC), 2013,
[30] SPEAKER VGG CCT: Cross-corpus Speech Emotion Recognition with Speaker Embedding and Vision Transformers
Arezzo, Alessandro
Berretti, Stefano
PROCEEDINGS OF THE 4TH ACM INTERNATIONAL CONFERENCE ON MULTIMEDIA IN ASIA, MMASIA 2022, 2022,
