A COMPARISON OF SUPERVISED AND UNSUPERVISED CROSS-LINGUAL SPEAKER ADAPTATION APPROACHES FOR HMM-BASED SPEECH SYNTHESIS

Cited by: 17

Authors:

Liang, Hui [1]

Dines, John [1]

Saheer, Lakshmi [1]

Affiliations:

[1] Idiap Res Inst, Martigny, Switzerland

Source:

2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING | 2010

Keywords:

unsupervised cross-lingual speaker adaptation; decision tree marginalization; HMM state mapping; ALGORITHMS;

DOI:

10.1109/ICASSP.2010.5495559

Chinese Library Classification (CLC) number:

O42 [Acoustics];

Subject classification codes:

070206; 082403

Abstract:

The EMIME project aims to build a personalized speech-to-speech translator, in which a user's spoken input in one language is used to produce spoken output in another language that still sounds like the user's voice. This distinctive feature makes unsupervised cross-lingual speaker adaptation one key to the project's success. So far, the unsupervised and cross-lingual cases have been studied separately, by means of decision tree marginalization and HMM state mapping, respectively. In this paper we combine the two techniques to perform unsupervised cross-lingual speaker adaptation. The performance of eight speaker adaptation systems (supervised vs. unsupervised, intra-lingual vs. cross-lingual) is compared using objective and subjective evaluations. Experimental results show that, in the EMIME scenario, the performance of unsupervised cross-lingual speaker adaptation is comparable to that of the supervised case in terms of spectrum adaptation, even though the automatically obtained transcriptions have a very high phoneme error rate.

Pages: 4598-4601

Page count: 4

References (7 in total):

[1] [Anonymous], 1999, P EUROSPEECH
[2] DINES J, 2009, P INTERSPEECH, P1391
[3] DINES J, 2009, P INTERSPEECH, P1395
[4] Qian, Yao; Liang, Hui; Soong, Frank K. A Cross-Language State Sharing and Mapping Approach to Bilingual (Mandarin-English) TTS [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2009, 17 (06): 1231-1239
[5] Tokuda K, 2000, INT CONF ACOUST SPEE, P1315, DOI 10.1109/ICASSP.2000.861820
[6] Wu YJ, 2009, INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, P516
[7] Yamagishi, Junichi; Kobayashi, Takao; Nakano, Yuji; Ogata, Katsumi; Isogai, Juri. Analysis of Speaker Adaptation Algorithms for HMM-Based Speech Synthesis and a Constrained SMAPLR Adaptation Algorithm [J]. IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2009, 17 (01): 66-83
