IMPROVED TIME-FREQUENCY TRAJECTORY EXCITATION MODELING FOR A STATISTICAL PARAMETRIC SPEECH SYNTHESIS SYSTEM

Cited by: 0

Authors:

Song, Eunwoo [1]

Joo, Young-Sun [1]

Kang, Hong-Goo [1]

Affiliations:

[1] Yonsei Univ, Dept Elect & Elect Engn, Seoul, South Korea

Source:

2015 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (ICASSP) | 2015

Keywords:

Statistical parametric speech synthesis; time-frequency trajectory excitation (TFTE); slowly evolving waveform (SEW); predicted average block coefficient (PABC)

DOI:

Not available

Chinese Library Classification (CLC) number:

O42 [Acoustics]

Discipline classification codes:

070206; 082403

Abstract:

This paper proposes an improved time-frequency trajectory excitation (TFTE) modeling method for a statistical parametric speech synthesis system. The proposed approach overcomes the dimensional variation problem of the training process caused by the inherent nature of the pitch-dependent analysis paradigm. By reducing the redundancies of the parameters using predicted average block coefficients (PABC), the proposed algorithm efficiently models excitation, even if its dimension is varied. Objective and subjective test results verify that the proposed algorithm provides not only robustness to the training process but also naturalness to the synthesized speech.

Pages: 4949-4953

Page count: 5
