A Dynamic Cost Weighting Framework for Unit Selection Text-to-Speech Synthesis

Cited by: 9

Authors:

Bellegarda, Jerome R. [1]

Affiliations:

[1] Apple Comp Inc, Speech & Language Technol, Cupertino, CA 95014 USA

Source:

IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING | 2010 / Vol. 18 / Issue 06

Keywords:

Candidate ranking; concatenation-specific cost weighting; concatenative speech synthesis; multiple information streams; unit selection;

DOI:

10.1109/TASL.2009.2035209

Chinese Library Classification:

O42 [Acoustics];

Discipline classification codes:

070206 ; 082403 ;

Abstract:

Unit selection text-to-speech synthesis relies on multiple cost criteria, each encapsulating a different aspect of acoustic and prosodic context at any given concatenation point. Constraints are normally invoked on diverse characteristics such as inter-unit discontinuity, overall pitch contour, local duration profile, etc., leading to costs often too heterogeneous for a direct quantitative comparison. In order to rank available candidate units, this complexity must be reduced to a single number, and the relative importance of each information stream becomes highly critical. Yet this influence is typically determined in an empirical manner (e.g., based on a limited amount of synthesized data), yielding global weights that are thus applied to broad classes of concatenations indiscriminately. This paper proposes an alternative approach, dynamic cost weighting, based on a data-driven framework separately optimized for each concatenation considered. Specifically, the cost distribution in every stream is dynamically leveraged on a per-concatenation basis to locally shift weight towards those characteristics that offer a high discrimination between candidate units, and away from those characteristics that are intrinsically less discriminative. An illustrative case study demonstrates the potential benefits of this solution, and listening evidence suggests that it does indeed entail higher perceived TTS quality.
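
The abstract does not give the weighting formula, so the following is only an illustrative sketch of the general idea of per-concatenation weighting, not the paper's actual method. It assumes, hypothetically, that a stream's discriminative power can be approximated by the coefficient of variation of its costs across the candidate units, and that heterogeneous streams can be made comparable by dividing each cost by its stream mean. The function names and the example cost values are invented for illustration.

```python
from statistics import mean, pstdev


def stream_weights(stream_costs):
    """Return one weight per stream for a single concatenation.

    stream_costs maps a stream name to a list of non-negative costs,
    one per candidate unit. ASSUMPTION: discrimination is measured by
    the coefficient of variation (std / mean); the paper may use a
    different measure.
    """
    spreads = {}
    for name, costs in stream_costs.items():
        avg = mean(costs)
        spreads[name] = pstdev(costs) / avg if avg > 0 else 0.0
    total = sum(spreads.values())
    if total == 0:
        # No stream separates the candidates: fall back to uniform weights.
        return {name: 1.0 / len(spreads) for name in spreads}
    return {name: s / total for name, s in spreads.items()}


def rank_candidates(stream_costs):
    """Return candidate indices sorted from lowest to highest combined cost."""
    weights = stream_weights(stream_costs)
    n = len(next(iter(stream_costs.values())))
    combined = [0.0] * n
    for name, costs in stream_costs.items():
        avg = mean(costs)
        for i, c in enumerate(costs):
            # ASSUMPTION: mean-normalize so streams with different units
            # can be summed into a single number.
            combined[i] += weights[name] * (c / avg if avg > 0 else 0.0)
    return sorted(range(n), key=lambda i: combined[i])


# Hypothetical costs for three candidate units at one concatenation point.
costs = {
    "pitch": [0.10, 0.11, 0.09],     # nearly identical: weakly discriminative
    "duration": [5.0, 20.0, 35.0],   # widely spread: strongly discriminative
}
weights = stream_weights(costs)
ranking = rank_candidates(costs)
print(weights)
print(ranking)
```

In this toy setting the duration stream receives most of the weight because its costs vary far more across candidates than the pitch costs do, which mirrors the abstract's description of shifting weight towards the more discriminative characteristics.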

Pages: 1455 - 1463

Page count: 9

50 items in total

[1] An Overview of the ILSP Unit Selection Text-to-Speech Synthesis System
Tsiakoulis, Pirros
Karabetsos, Sotiris
Chalamandaris, Aimilios
Raptis, Spyros
ARTIFICIAL INTELLIGENCE: METHODS AND APPLICATIONS, 2014, 8445 : 370 - 383
[2] A global, boundary-centric framework for unit selection text-to-speech synthesis
Bellegarda, JR
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2006, 14 (03) : 990 - 997
[3] Embedded Unit Selection Text-to-Speech Synthesis for Mobile Devices
Karabetsos, Sotiris
Tsiakoulis, Pirros
Chalamandaris, Aimilios
Raptis, Spyros
IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2009, 55 (02) : 613 - 621
[4] Continuity Metric for Unit Selection based Text-to-Speech Synthesis
Lakkavalli, Vikram Ramesh
Arulmozhi, P.
Ramakrishnan, A. G.
2010 INTERNATIONAL CONFERENCE ON SIGNAL PROCESSING AND COMMUNICATIONS (SPCOM), 2010
[5] Unit-centric feature mapping for inventory pruning in unit selection text-to-speech synthesis
Bellegarda, Jerome R.
IEEE TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING, 2008, 16 (01) : 74 - 82
[6] High quality Arabic text-to-speech synthesis using unit selection
Abdelmalek, Raja
Mnasri, Zied
2016 13TH INTERNATIONAL MULTI-CONFERENCE ON SYSTEMS, SIGNALS & DEVICES (SSD), 2016, : 1 - 5
[7] Optimal weight tuning method for unit selection cost functions in syllable based text-to-speech synthesis
Narendra, N. P.
Rao, K. Sreenivasa
APPLIED SOFT COMPUTING, 2013, 13 (02) : 773 - 781
[8] PERCEPTUAL EVALUATION OF DYNAMIC COST WEIGHTING FOR UNIT SELECTION TTS
Bellegarda, Jerome R.
2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4806 - 4809
[9] A Unit Selection Text-to-Speech Synthesis System Optimized for Use with Screen Readers
Chalamandaris, Aimilios
Karabetsos, Sotiris
Tsiakoulis, Pirros
Raptis, Spyros
IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2010, 56 (03) : 1890 - 1897
[10] PERCEPTUAL CLUSTERING BASED UNIT SELECTION OPTIMIZATION FOR CONCATENATIVE TEXT-TO-SPEECH SYNTHESIS
Jiang, Tao
Wu, Zhiyong
Jia, Jia
Cai, Lianhong
2012 8TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, 2012, : 64 - 68
