LSM-based unit pruning for concatenative speech synthesis

Cited by: 0
Authors
Bellegarda, Jerome R. [1]
Affiliations
[1] Apple Comp Inc, Speech & Language Technol, Cupertino, CA 95014 USA
Source
2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3 | 2007
Keywords
text-to-speech synthesis; unit selection; inventory pruning; outlier removal; unit redundancy management;
DOI
Not available
Chinese Library Classification (CLC) number
O42 [Acoustics];
Discipline classification codes
070206 ; 082403 ;
Abstract
The level of quality that can be achieved in concatenative text-to-speech synthesis is primarily governed by the inventory of units used in unit selection. This has led to the collection of ever larger corpora in the quest for ever more natural synthetic speech. As operational considerations limit the size of the unit inventory, however, pruning is critical to removing any instances that prove either spurious or superfluous. This paper proposes a novel pruning strategy based on a data-driven feature extraction framework separately optimized for each unit type in the inventory. A single distinctiveness/redundancy measure can then address, in a consistent manner, the (traditionally separate) problems of outliers and redundant units. Experimental results underscore the viability of this approach for both moderate and aggressive inventory pruning.
Pages: 521-524
Page count: 4
Related papers
50 in total
  • [31] Unit Selection Model in Arabic Speech Synthesis
    Al-Saiyd, Nedhal A.
    Hijjawi, Mohammad
    INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY, 2018, 18 (04): 126 - 131
  • [32] Polish unit selection speech synthesis with BOSS: extensions and speech corpora
    Demenko, Grazyna
    Klessa, Katarzyna
    Szymanski, Marcin
    Breuer, Stefan
    Hess, Wolfgang
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2010, 13 (02): 85 - 99
  • [33] Syllable specific unit selection cost functions for text-to-speech synthesis
    Narendra, N.P.
    Sreenivasa Rao, K.
    ACM Transactions on Speech and Language Processing, 2012, 9 (03)
  • [34] Unit Selection Speech Synthesis Using Frame-Sized Speech Segments and Neural Network Based Acoustic Models
    Zhen-Hua Ling
    Zhi-Ping Zhou
    Journal of Signal Processing Systems, 2018, 90: 1053 - 1062
  • [35] Unit Selection Speech Synthesis Using Frame-Sized Speech Segments and Neural Network Based Acoustic Models
    Ling, Zhen-Hua
    Zhou, Zhi-Ping
    JOURNAL OF SIGNAL PROCESSING SYSTEMS FOR SIGNAL IMAGE AND VIDEO TECHNOLOGY, 2018, 90 (07): 1053 - 1062
  • [36] Development and Evaluation of Polish Speech Corpus for Unit Selection Speech Synthesis Systems
    Demenko, G.
    Bachan, J.
    Moebius, B.
    Klessa, K.
    Szymanski, M.
    Grocholewski, S.
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008: 1650 - +
  • [37] Learning and Modeling Unit Embeddings Using Deep Neural Networks for Unit-Selection-Based Mandarin Speech Synthesis
    Zhou, Xiao
    Ling, Zhen-Hua
    Dai, Li-Rong
    ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2020, 19 (03)
  • [38] Joint Prosodic and Segmental Unit Selection Speech Synthesis
    Clark, Robert A. J.
    King, Simon
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006: 1312 - 1315
  • [39] The Target Cost Formulation in Unit Selection Speech Synthesis
    Taylor, Paul
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006: 2038 - 2041
  • [40] On the Impact of Labialization Contexts on Unit Selection Speech Synthesis
    Tihelka, Daniel
    Hanzlicek, Zdenek
    Machac, Pavel
    Skarnitzl, Radek
    Matousek, Jindrich
    2012 IEEE INTERNATIONAL SYMPOSIUM ON SIGNAL PROCESSING AND INFORMATION TECHNOLOGY (ISSPIT), 2012: 187 - 192