Statistical methods in data-driven modeling of Spanish prosody for text to speech

Cited by: 0
Authors
LopezGonzalo, E
RodriguezGarcia, JM
Source
ICSLP 96 - FOURTH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, VOLS 1-4 | 1996
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
In [1], we proposed an automatic data-driven methodology to model both fundamental frequency and segmental duration in TTS converters from a single-speaker recorded corpus. It therefore had the advantage that it could be adapted to a specific corpus or a particular speaker. The main disadvantage was the size of the resulting prosodic database. In this paper, we propose using statistical methods to reduce the prosodic database required by this methodology. A 50% reduction can be obtained without compromising the naturalness of the synthetic speech produced by our previous methodology with the same prosodic corpus. A compromise between variability and reduction in prosodic contours is also discussed.
Pages: 1377-1380
Number of pages: 4