A POLYNOMIAL SEGMENT MODEL BASED STATISTICAL PARAMETRIC SPEECH SYNTHESIS SYSTEM

Citations: 0

Authors:

Sun, Jingwei [1]

Ding, Feng [1]

Wu, Yahui [1]

Affiliations:

[1] Nokia Res, Beijing, People's Republic of China

Source:

2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-8, PROCEEDINGS | 2009

Keywords:

Hidden Markov Model; Polynomial Segment Model; statistical parametric speech synthesis; mean trajectory;

DOI:

Not available

Chinese Library Classification number:

O42 [Acoustics];

Discipline classification codes:

070206; 082403

Abstract:

In this paper, we present a statistical parametric speech synthesis system based on the polynomial segment model (PSM). As one of the segmental models for speech signals, the PSM explicitly describes the trajectory of the features in a speech segment and preserves the internal dynamics of the segment. In this work, spectral and excitation parameters are modeled simultaneously by PSMs, while the duration of each segment is modeled by a single Gaussian distribution. A top-down K-means clustering technique is applied for model tying. Mean trajectories obtained from the PSMs are used directly to generate speech parameters according to the estimated segment duration. An English speech synthesizer back-end is implemented on the CMU Arctic corpus, and the performance of the new approach is compared with that of the classical HMM-based one. Experimental results show that PSM modeling can achieve naturalness and intelligibility of the synthetic speech similar to those of HMM modeling. The system is in the early stage of its development.
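The generation step described in the abstract (mean trajectories evaluated over an estimated segment duration) can be illustrated with a minimal sketch. It assumes the common PSM formulation in which each feature dimension follows a polynomial in frame time normalized to [0, 1]; the record does not give the paper's exact parameterization. The names `mean_trajectory`, `segment_frames`, and `coeffs` are hypothetical, and rounding the Gaussian duration mean to a frame count is likewise an assumption.

```python
def mean_trajectory(coeffs, num_frames):
    """Evaluate a polynomial mean trajectory over num_frames frames.

    coeffs[r][d] is the (hypothetical) coefficient of t**r for feature
    dimension d, where t is the frame time normalized to [0, 1].
    Returns a list of num_frames feature vectors.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    dim = len(coeffs[0])
    frames = []
    for n in range(num_frames):
        # Normalized time; a single-frame segment is evaluated at t = 0.
        t = n / (num_frames - 1) if num_frames > 1 else 0.0
        frames.append(
            [sum(row[d] * t ** r for r, row in enumerate(coeffs))
             for d in range(dim)]
        )
    return frames


def segment_frames(duration_mean, min_frames=1):
    """Assumed duration rule: round the Gaussian mean to a frame count."""
    return max(min_frames, round(duration_mean))


# Usage: a linear (order-1) trajectory for a 2-dimensional feature.
coeffs = [[1.0, 0.0],    # constant term per dimension
          [2.0, -1.0]]   # linear term per dimension
trajectory = mean_trajectory(coeffs, segment_frames(3.2))
```

Because time is normalized, the same coefficients stretch or compress to any segment length, which is what allows a segment-level trajectory model to be driven by a separately modeled duration.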

Pages: 4021-4024

Page count: 4


