Discourse Prosody and Its Application to Speech Synthesis

Citations: 0
Authors
Hu, Na [1]
Shao, Pengfei [1]
Zu, Yiqing [1]
Wang, Zuyan [1]
Huang, Wei [1]
Wang, Shijin [1]
Affiliations
[1] iFLYTEK Res, Hefei, Peoples R China
Source
2016 10TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING (ISCSLP) | 2016
Keywords
speech synthesis; Rhetorical Structure Theory; discourse prosody; BOUNDARIES;
DOI
Not available
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
This paper reveals correlations between discourse structure and acoustic parameters and presents a method for manipulating discourse prosody according to discourse structure in order to improve the naturalness of synthesized speech. The text material comprised 1229 passages, which were annotated using Rhetorical Structure Theory. Prosodic measurements were extracted from the corresponding speech annotations, and statistical analyses were then conducted. The results showed that: 1) segments at higher hierarchical levels were preceded by longer pauses; 2) nucleus segments had a longer average duration than satellite segments. To test whether rhetorical structure would benefit the prosody of synthesized speech, 15 passages were synthesized with discourse features implemented. The evaluation results indicated that the modified synthesized speech outperformed the baseline system by 0.1 MOS points, suggesting that implementing prosodic features in synthesized speech would improve overall prosody.
Pages: 5