LSM-based unit pruning for concatenative speech synthesis

Cited by: 0
Author
Bellegarda, Jerome R. [1 ]
Affiliation
[1] Apple Comp Inc, Speech & Language Technol, Cupertino, CA 95014 USA
Source
2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3 | 2007
Keywords
text-to-speech synthesis; unit selection; inventory pruning; outlier removal; unit redundancy management;
DOI
Not available
Chinese Library Classification number
O42 [Acoustics];
Subject classification code
070206 ; 082403 ;
Abstract
The level of quality that can be achieved in concatenative text-to-speech synthesis is primarily governed by the inventory of units used in unit selection. This has led to the collection of ever larger corpora in the quest for ever more natural synthetic speech. As operational considerations limit the size of the unit inventory, however, pruning is critical to removing any instances that prove either spurious or superfluous. This paper proposes a novel pruning strategy based on a data-driven feature extraction framework separately optimized for each unit type in the inventory. A single distinctiveness/redundancy measure can then address, in a consistent manner, the (traditionally separate) problems of outliers and redundant units. Experimental results underscore the viability of this approach for both moderate and aggressive inventory pruning.
Pages: 521-524
Page count: 4