LSM-based unit pruning for concatenative speech synthesis

Cited by: 0
Author
Bellegarda, Jerome R. [1]
Affiliation
[1] Apple Comp Inc, Speech & Language Technol, Cupertino, CA 95014 USA
Source
2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3 | 2007
Keywords
text-to-speech synthesis; unit selection; inventory pruning; outlier removal; unit redundancy management
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification code
070206; 082403
Abstract
The level of quality that can be achieved in concatenative text-to-speech synthesis is primarily governed by the inventory of units used in unit selection. This has led to the collection of ever larger corpora in the quest for ever more natural synthetic speech. As operational considerations limit the size of the unit inventory, however, pruning is critical to removing any instances that prove either spurious or superfluous. This paper proposes a novel pruning strategy based on a data-driven feature extraction framework separately optimized for each unit type in the inventory. A single distinctiveness/redundancy measure can then address, in a consistent manner, the (traditionally separate) problems of outliers and redundant units. Experimental results underscore the viability of this approach for both moderate and aggressive inventory pruning.
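As a rough illustration of how one closeness measure could handle both outliers and redundant units, here is a minimal sketch. It is not the paper's actual procedure: the feature matrix, the use of a plain truncated SVD as the per-unit-type feature extraction step, the cosine closeness measure, the function name `prune_units`, and all threshold values are assumptions made for illustration only.

```python
import numpy as np

def prune_units(features, rank=2, outlier_thresh=0.0, redundancy_thresh=0.999):
    """Return sorted indices of the units kept for ONE unit type.

    features: (n_units, n_dims) array, one row per instance of the unit type.
    rank and both thresholds are hypothetical illustration values.
    """
    X = np.asarray(features, dtype=float)

    # Feature extraction fitted to this unit type only: a truncated SVD
    # of the type's own instance-by-feature matrix (an assumed stand-in).
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    Z = U[:, :rank] * s[:rank]

    # Cosine closeness between every pair of instances in the reduced space.
    Z = Z / np.maximum(np.linalg.norm(Z, axis=1, keepdims=True), 1e-12)
    closeness = Z @ Z.T

    # Average closeness of each instance to all the others.
    n = len(Z)
    mean_close = (closeness.sum(axis=1) - np.diag(closeness)) / max(n - 1, 1)

    kept = []
    for i in np.argsort(-mean_close):  # visit the most central instances first
        if mean_close[i] < outlier_thresh:
            continue  # far from everything else: treated as an outlier
        if any(closeness[i, j] > redundancy_thresh for j in kept):
            continue  # nearly identical to a kept instance: redundant
        kept.append(int(i))
    return sorted(kept)
```

In this sketch, lowering `redundancy_thresh` or raising `outlier_thresh` prunes more aggressively, which loosely mirrors the moderate versus aggressive pruning settings the abstract mentions.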
Pages: 521-524
Page count: 4