Non-parallel training for voice conversion by maximum likelihood constrained adaptation

Cited by: 0
Authors
Mouchtaris, A [1]
Van der Spiegel, J [1]
Mueller, P [1]
Affiliation
[1] Univ Penn, Dept Elect & Syst Engn, Philadelphia, PA 19104 USA
Keywords
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Subject classification codes
070206; 082403;
Abstract
The objective of voice conversion methods is to modify the speech characteristics of a particular speaker so that the speech sounds as if it were spoken by a different target speaker. Current voice conversion algorithms derive a conversion function by estimating its parameters from a corpus that contains the same utterances spoken by both speakers. Such a corpus, usually referred to as a parallel corpus, has the disadvantage that it is often difficult or even impossible to collect. Here, we propose a voice conversion method that does not require a parallel corpus for training, i.e., the utterances spoken by the two speakers need not be the same. The method employs speaker adaptation techniques to adapt the conversion parameters derived from a different pair of speakers to a particular pair of source and target speakers. We show that adaptation reduces the error obtained when simply applying the conversion parameters of one pair of speakers to another by a factor that can reach 30% in many cases, with performance comparable to the ideal case in which a parallel corpus is available.
Pages: 1 / 4
Number of pages: 4
Related papers
50 in total
  • [1] NON-PARALLEL TRAINING FOR VOICE CONVERSION BASED ON ADAPTATION METHOD
    Song, Peng
    Zheng, Wenming
    Zhao, Li
    2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2013, : 6905 - 6909
  • [2] NON-PARALLEL TRAINING FOR VOICE CONVERSION BASED ON FT-GMM
    Chen, Ling-Hui
    Ling, Zhen-Hua
    Dai, Li-Rong
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5116 - 5119
  • [3] SINGING VOICE CONVERSION WITH NON-PARALLEL DATA
    Chen, Xin
    Chu, Wei
    Guo, Jinxi
    Xu, Ning
    2019 2ND IEEE CONFERENCE ON MULTIMEDIA INFORMATION PROCESSING AND RETRIEVAL (MIPR 2019), 2019, : 292 - 296
  • [4] Non-Parallel Voice Conversion for ASR Augmentation
    Wang, Gary
    Rosenberg, Andrew
    Ramabhadran, Bhuvana
    Biadsy, Fadi
    Huang, Yinghui
    Emond, Jesse
    Mengibar, Pedro Moreno
    INTERSPEECH 2022, 2022, : 3408 - 3412
  • [5] Non-Parallel Training in Voice Conversion Using an Adaptive Restricted Boltzmann Machine
    Nakashika, Toru
    Takiguchi, Tetsuya
    Minami, Yasuhiro
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (11) : 2032 - 2045
  • [6] VAW-GAN for Singing Voice Conversion with Non-parallel Training Data
    Lu, Junchen
    Zhou, Kun
    Sisman, Berrak
    Li, Haizhou
    2020 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2020, : 514 - 519
  • [7] CVC: Contrastive Learning for Non-parallel Voice Conversion
    Li, Tingle
    Liu, Yichen
    Hu, Chenxu
    Zhao, Hang
    INTERSPEECH 2021, 2021, : 1324 - 1328
  • [8] NOVEL METRIC LEARNING FOR NON-PARALLEL VOICE CONVERSION
    Shah, Nirmesh J.
    Patil, Hemant A.
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 3722 - 3726
  • [9] Frame Labeling and Mapping for Non-parallel Voice Conversion
    Dong, Minghui
    Yang, Chenyu
    Ehnes, Jochen Walter
    Lu, Yanfeng
    Ming, Huaiping
    Huang, Dongyan
    2017 IEEE 2ND INTERNATIONAL CONFERENCE ON SIGNAL AND IMAGE PROCESSING (ICSIP), 2017, : 361 - 365
  • [10] Non-parallel Voice Conversion with Generative Attentional Networks
    Chiu, Tse Wei
    Guo, You Sheng
    Chang, Pao-Chi
    2021 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2021, : 141 - 145