Building Text Corpus for Unit Selection Synthesis

Cited by: 4
Authors
Kasparaitis, Pijus [1]
Anbinderis, Tomas [1]
Affiliations
[1] Vilnius State Univ, Fac Math & Informat, Dept Comp Sci 2, LT-03225 Vilnius, Lithuania
Keywords
text-to-speech synthesis; unit selection; greedy algorithm; speech synthesis
DOI
10.15388/Informatica.2014.29
Chinese Library Classification number
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
The present paper deals with building the text corpus for unit selection text-to-speech synthesis. During synthesis, target and concatenation costs are calculated, usually from the prosodic and acoustic features of sounds. If the cost calculation is moved to the phonological level, unit selection synthesis can be simulated without any real recordings; text transcriptions are sufficient. We propose using the cost calculated during a simulated synthesis of test data to evaluate text corpus quality. A greedy algorithm that maximizes coverage of certain phonetic units is used to build the corpus. Corpora optimized to cover phonetic units of different size and weight are evaluated.
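The greedy coverage idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `units` and `greedy_corpus`, the use of overlapping n-phone units, and the weighting scheme (a unit's weight is the number of candidate sentences containing it) are all assumptions made for illustration, since this record does not specify them.

```python
from collections import Counter


def units(transcription, n=2):
    """Return the set of overlapping n-phone units in a phoneme sequence (n=2: diphones)."""
    return {tuple(transcription[i:i + n]) for i in range(len(transcription) - n + 1)}


def greedy_corpus(sentences, max_sentences, n=2):
    """Greedily pick sentences that add the largest weight of not-yet-covered units.

    `sentences` is a list of phoneme sequences (text transcriptions only).
    Returns the indices of the selected sentences and the set of covered units.
    """
    sentence_units = [units(s, n) for s in sentences]
    # Assumed weighting: how many candidate sentences contain each unit.
    weights = Counter(u for su in sentence_units for u in su)

    def gain(i, covered):
        return sum(weights[u] for u in sentence_units[i] - covered)

    covered, selected = set(), []
    remaining = set(range(len(sentences)))
    while remaining and len(selected) < max_sentences:
        # Ties are broken in favour of the lowest sentence index.
        best = max(sorted(remaining), key=lambda i: gain(i, covered))
        if gain(best, covered) == 0:
            break  # no remaining sentence adds a new unit
        selected.append(best)
        covered |= sentence_units[best]
        remaining.remove(best)
    return selected, covered
```

For example, calling `greedy_corpus` on a list of phoneme sequences with `n=2` selects sentences by diphone coverage; changing `n` or the weights corresponds loosely to the "different size and weight" of units that the abstract says were compared.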
Pages: 551-562
Number of pages: 12
Related papers
14 in total
[1] Anbinderis T., 2009, KALBU STUDIJOS STUDI, V14, P25
[2] Bozkurt B., 2003, EUROSPEECH 2003, P277
[3] Breen A. P., 1998, P 3 ESCA WORKSH SPEE, P373
[4] Buchsbaum A., 1997, EUR 1997, P553
[5] Francois H., 2001, INT 2001, P829
[6] Francois H., 2002, P LREC 2002, P1420
[7] Hunt A. J., 1996, INT CONF ACOUST SPEE, P373, DOI 10.1109/ICASSP.1996.541110
[8] Kasparaitis P., 2005, INFORMATICA-LITHUAN, V16, P193
[9] Pyz G., 2014, INFORMATICA-LITHUAN, V25, P55
[10] Pyz G., 2011, INFORMATICA-LITHUAN, V22, P411