Non-parallel training for voice conversion by maximum likelihood constrained adaptation

被引:0
|
作者
Mouchtaris, A [1 ]
Van der Spiegel, J [1 ]
Mueller, P [1 ]
机构
[1] Univ Penn, Dept Elect & Syst Engn, Philadelphia, PA 19104 USA
关键词
D O I
暂无
中图分类号
O42 [声学];
学科分类号
070206 ; 082403 ;
摘要
The objective of voice conversion methods is to modify the speech characteristics of a particular speaker in such manner, as to sound like speech by a different target speaker. Current voice conversion algorithms are based on deriving a conversion function by estimating its parameters through a corpus that contains the same utterances spoken by both speakers. Such a corpus, usually referred to as a parallel corpus, has the disadvantage that many times it is difficult or even impossible to collect. Here, we propose a voice conversion method that does not require a parallel corpus for training, i.e. the spoken utterances by the two speakers need not be the same, by employing speaker adaptation techniques to adapt to a particular pair of source and target speakers, the derived conversion parameters from a different pair of speakers. We show that adaptation reduces the error obtained when simply applying the conversion parameters of one pair of speakers to another by a factor that can reach 30% in many cases, and with performance comparable with the ideal case when a parallel corpus is available.
引用
收藏
页码:1 / 4
页数:4
相关论文
共 50 条
  • [21] CAB: An Energy-Based Speaker Clustering Model for Rapid Adaptation in Non-Parallel Voice Conversion
    Nakashika, Toru
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3369 - 3373
  • [22] Non-parallel training for voice conversion using background-based alignment of GMMs and INCA algorithm
    Ghorbandoost, Mostafa
    Saba, Valiallah
    IET SIGNAL PROCESSING, 2017, 11 (08) : 998 - 1005
  • [23] MoCoVC: Non-parallel Voice Conversion with Momentum Contrastive Representation Learning
    Onishi, Kotaro
    Nakashika, Toru
    PROCEEDINGS OF 2022 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2022, : 1438 - 1443
  • [24] Non-parallel Voice Conversion using Weighted Generative Adversarial Networks
    Paul, Dipjyoti
    Pantazis, Yannis
    Stylianou, Yannis
    INTERSPEECH 2019, 2019, : 659 - 663
  • [25] A Speaker-Dependent WaveNet for Voice Conversion with Non-Parallel Data
    Tian, Xiaohai
    Chng, Eng Siong
    Li, Haizhou
    INTERSPEECH 2019, 2019, : 201 - 205
  • [26] Effects of Sinusoidal Model on Non-Parallel Voice Conversion with Adversarial Learning
    Al-Radhi, Mohammed Salah
    Csapo, Tamas Gabor
    Nemeth, Geza
    APPLIED SCIENCES-BASEL, 2021, 11 (16):
  • [27] Non-parallel Sequence-to-Sequence Voice Conversion for Arbitrary Speakers
    Zhang, Ying
    Che, Hao
    Wang, Xiaorui
    2021 12TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP), 2021,
  • [28] NON-PARALLEL TRAINING FOR MANY-TO-MANY EIGENVOICE CONVERSION
    Ohtani, Yamato
    Toda, Tomoki
    Saruwatari, Hiroshi
    Shikano, Kiyohiro
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4822 - 4825
  • [29] Many-to-Many Voice Conversion based on Bottleneck Features with Variational Autoencoder for Non-parallel Training Data
    Li, Yanping
    Lee, Kong Aik
    Yuan, Yougen
    Li, Haizhou
    Yang, Zhen
    2018 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2018, : 829 - 833
  • [30] Non-parallel Voice Conversion with Controllable Speaker Individuality using Variational Autoencoder
    Tuan Vu Ho
    Akagi, Masato
    2019 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2019, : 106 - 111