Trainable Cantonese/English dual language speech synthesis system

Cited by: 0

Authors:

Li, HP [1]

Chen, FX [1]

Shen, LQ [1]

Ma, XJ [1]

Affiliations:

[1] IBM Corp, China Res Lab, Beijing 10085, Peoples R China

Source:

2003 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL I, PROCEEDINGS: SPEECH PROCESSING I | 2003

Keywords:

DOI:

Not available

Chinese Library Classification:

O42 [Acoustics];

Discipline classification code:

070206 ; 082403 ;

Abstract:

The Cantonese/English dual-language text-to-speech (TTS) system introduced in this paper was developed on IBM's trainable TTS technology, which uses trainable statistical models to automate speech data processing and selection. The Cantonese and English phonological, syntactic and prosodic rules were built into a dual-language Delta module, which processes mixed-language input and generates mixed Cantonese and English speech with coherent prosody. To approximate the speaker's characteristics, a speaker prosody profile was extracted from the dataset and incorporated into Delta speech rule processing to improve duration, lexical tone and intonation prediction. In selecting the concatenative unit set, several Cantonese syllable decomposition schemes were tested. Though the system is currently implemented only for Cantonese, it can be easily adapted to other tonal languages.

Pages: 508-511

Page count: 4