NON-PARALLEL TRAINING FOR VOICE CONVERSION BASED ON ADAPTATION METHOD

Cited by: 0
Authors
Song, Peng [1]
Zheng, Wenming [2 ]
Zhao, Li [1 ]
Affiliations
[1] Southeast Univ, Sch Informat Sci & Engn, Nanjing 210096, Jiangsu, Peoples R China
[2] Southeast Univ, Res Ctr Learning Sci, Nanjing 210096, Jiangsu, Peoples R China
Source
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2013
Keywords
Voice conversion; non-parallel training; MAP; Gaussian normalization; mean transformation;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In this paper, we propose a simple and efficient non-parallel training scheme for voice conversion (VC). First, the speaker models are adapted from the background model using the maximum a posteriori (MAP) technique. Then, using the parameters of the adapted speaker models, Gaussian normalization and mean transformation methods are proposed for VC. In addition, to improve the conversion performance of the proposed methods, a combination approach is presented. Finally, objective and subjective experiments are carried out to evaluate the proposed scheme. The results demonstrate that the scheme achieves performance comparable to that of the traditional GMM method trained on a parallel corpus.
Pages: 6905-6909
Page count: 5
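
The abstract names MAP adaptation of speaker models from a background model, followed by a mean transformation. The record does not give the paper's equations, so the following is only a rough sketch under stated assumptions: a diagonal-covariance GMM background model, the common relevance-factor form of MAP mean adaptation, and a posterior-weighted mean shift as one plausible reading of "mean transformation". All function names, the `relevance` parameter, and its default value are hypothetical and not taken from the paper.

```python
import numpy as np

def posteriors(X, weights, means, variances):
    """Component posteriors p(i | x_t) for a diagonal-covariance GMM.

    X: (T, D) frames; weights: (M,); means, variances: (M, D).
    """
    diff = X[:, None, :] - means[None, :, :]              # (T, M, D)
    log_pdf = -0.5 * (np.sum(diff ** 2 / variances, axis=2)
                      + np.sum(np.log(2 * np.pi * variances), axis=1))
    log_w = np.log(weights) + log_pdf                     # (T, M)
    log_w -= log_w.max(axis=1, keepdims=True)             # numerical stability
    post = np.exp(log_w)
    return post / post.sum(axis=1, keepdims=True)

def map_adapt_means(X, weights, means, variances, relevance=16.0):
    """Relevance-factor MAP adaptation of the means only (assumed form)."""
    post = posteriors(X, weights, means, variances)
    n = post.sum(axis=0)                                  # soft counts, (M,)
    ex = post.T @ X / np.maximum(n, 1e-10)[:, None]       # data means, (M, D)
    alpha = (n / (n + relevance))[:, None]                # adaptation weight
    return alpha * ex + (1.0 - alpha) * means

def mean_shift_convert(X, weights, variances, src_means, tgt_means):
    """Shift each frame by the posterior-weighted difference of adapted means.

    This is an assumed reading of "mean transformation", not the paper's
    confirmed formula.
    """
    post = posteriors(X, weights, src_means, variances)
    return X + post @ (tgt_means - src_means)
```

Because both speaker models are adapted from the same background model, their components stay index-aligned, which is what lets the sketch pair `src_means[i]` with `tgt_means[i]` without parallel utterances.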
Related papers
50 in total
  • [1] Non-parallel training for voice conversion by maximum likelihood constrained adaptation
    Mouchtaris, A
    Van der Spiegel, J
    Mueller, P
    2004 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING, 2004, : 1 - 4
  • [2] A novel method for voice conversion based on non-parallel corpus
    Sayadian A.
    Mozaffari F.
    International Journal of Speech Technology, 2017, 20 (3) : 587 - 592
  • [3] NON-PARALLEL TRAINING FOR VOICE CONVERSION BASED ON FT-GMM
    Chen, Ling-Hui
    Ling, Zhen-Hua
    Dai, Li-Rong
    2011 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2011, : 5116 - 5119
  • [4] SPEAKER ADAPTIVE MODEL BASED ON BOLTZMANN MACHINE FOR NON-PARALLEL TRAINING IN VOICE CONVERSION
    Nakashika, Toru
    Minami, Yasuhiro
    2016 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING PROCEEDINGS, 2016, : 5530 - 5534
  • [5] SINGING VOICE CONVERSION WITH NON-PARALLEL DATA
    Chen, Xin
    Chu, Wei
    Guo, Jinxi
    Xu, Ning
    2019 2ND IEEE CONFERENCE ON MULTIMEDIA INFORMATION PROCESSING AND RETRIEVAL (MIPR 2019), 2019, : 292 - 296
  • [6] Non-Parallel Voice Conversion for ASR Augmentation
    Wang, Gary
    Rosenberg, Andrew
    Ramabhadran, Bhuvana
    Biadsy, Fadi
    Huang, Yinghui
    Emond, Jesse
    Mengibar, Pedro Moreno
    INTERSPEECH 2022, 2022, : 3408 - 3412
  • [7] MAP-BASED ADAPTATION FOR SPEECH CONVERSION USING ADAPTATION DATA SELECTION AND NON-PARALLEL TRAINING
    Lee, Chung-Han
    Wu, Chung-Hsien
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 2254 - 2257
  • [8] Non-Parallel Training in Voice Conversion Using an Adaptive Restricted Boltzmann Machine
    Nakashika, Toru
    Takiguchi, Tetsuya
    Minami, Yasuhiro
    IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2016, 24 (11) : 2032 - 2045
  • [9] VAW-GAN for Singing Voice Conversion with Non-parallel Training Data
    Lu, Junchen
    Zhou, Kun
    Sisman, Berrak
    Li, Haizhou
    2020 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE (APSIPA ASC), 2020, : 514 - 519
  • [10] CAB: An Energy-Based Speaker Clustering Model for Rapid Adaptation in Non-Parallel Voice Conversion
    Nakashika, Toru
    18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION, 2017, : 3369 - 3373