A COMPARISON OF SUPERVISED AND UNSUPERVISED CROSS-LINGUAL SPEAKER ADAPTATION APPROACHES FOR HMM-BASED SPEECH SYNTHESIS

Cited by: 17
Authors
Liang, Hui [1]
Dines, John [1]
Saheer, Lakshmi [1]
Affiliations
[1] Idiap Res Inst, Martigny, Switzerland
Source
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2010
Keywords
unsupervised cross-lingual speaker adaptation; decision tree marginalization; HMM state mapping; ALGORITHMS;
DOI
10.1109/ICASSP.2010.5495559
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
The EMIME project aims to build a personalized speech-to-speech translator, in which a user's spoken input in one language is used to produce spoken output in another language that still sounds like the user's voice. This distinctive requirement makes unsupervised cross-lingual speaker adaptation a key to the project's success. So far, the unsupervised and cross-lingual cases have been studied separately, by means of decision tree marginalization and HMM state mapping, respectively. In this paper we combine the two techniques to perform unsupervised cross-lingual speaker adaptation. The performance of eight speaker adaptation systems (supervised vs. unsupervised, intra-lingual vs. cross-lingual) is compared using objective and subjective evaluations. Experimental results show that, in the EMIME scenario, unsupervised cross-lingual speaker adaptation performs comparably to the supervised case in terms of spectrum adaptation, even though the automatically obtained transcriptions have a very high phoneme error rate.
Pages: 4598-4601
Number of pages: 4