A POLYNOMIAL SEGMENT MODEL BASED STATISTICAL PARAMETRIC SPEECH SYNTHESIS SYSTEM

Authors
Sun, Jingwei [1]
Ding, Feng [1]
Wu, Yahui [1]
Affiliation
[1] Nokia Res, Beijing, Peoples R China
Source
2009 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOLS 1-8, PROCEEDINGS | 2009
Keywords
Hidden Markov Model; Polynomial Segment Model; statistical parametric speech synthesis; mean trajectory;
DOI
Not available
Chinese Library Classification
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
In this paper, we present a statistical parametric speech synthesis system based on the polynomial segment model (PSM). As one of the segmental models for speech signals, PSM explicitly describes the trajectory of the features in a speech segment and preserves the internal dynamics of the segment. In this work, spectral and excitation parameters are modeled by PSMs simultaneously, while the duration of each segment is modeled by a single Gaussian distribution. A top-down K-means clustering technique is applied for model tying. Mean trajectories acquired from PSMs are used directly to generate speech parameters according to the estimated segment duration. An English speech synthesizer back-end is implemented on the CMU Arctic corpus, and the performance of the new approach is compared with that of the classical HMM-based one. Experimental results show that PSM modeling can achieve naturalness and intelligibility of the synthetic speech similar to those of HMM modeling. The system is at an early stage of its development.
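As a rough illustration of the mean-trajectory idea in the abstract, the sketch below fits a low-order polynomial to one scalar feature track of a segment over normalized time, then resamples that polynomial at a different duration. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names (`time_basis`, `solve_linear`, `fit_segment`, `mean_trajectory`), the polynomial order, the time normalization to [0, 1], and the plain least-squares fit are all illustrative choices, and the covariance part of a PSM, model tying, and the Gaussian duration model are left out.

```python
def time_basis(num_frames, order):
    """Rows [1, t, ..., t^order], with t normalized to [0, 1] over the segment."""
    if num_frames == 1:
        times = [0.0]
    else:
        times = [n / (num_frames - 1) for n in range(num_frames)]
    return [[t ** k for k in range(order + 1)] for t in times]


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def fit_segment(frames, order=2):
    """Least-squares polynomial coefficients for one scalar feature track.

    Assumes len(frames) >= order + 1 so the normal equations are solvable.
    """
    z = time_basis(len(frames), order)
    m = order + 1
    ztz = [[sum(row[i] * row[j] for row in z) for j in range(m)] for i in range(m)]
    zty = [sum(row[i] * y for row, y in zip(z, frames)) for i in range(m)]
    return solve_linear(ztz, zty)


def mean_trajectory(coeffs, duration):
    """Evaluate the polynomial mean at `duration` evenly spaced frames."""
    z = time_basis(duration, len(coeffs) - 1)
    return [sum(c * v for c, v in zip(coeffs, row)) for row in z]


# Hypothetical 5-frame track sampled from 1 + 2t - 3t^2 on t in [0, 1].
observed = [1.0 + 2.0 * t - 3.0 * t * t for t in (0.0, 0.25, 0.5, 0.75, 1.0)]
coeffs = fit_segment(observed, order=2)
stretched = mean_trajectory(coeffs, duration=9)  # same shape, 9 frames long
```

Because time is normalized within the segment, the same coefficients can be evaluated at any frame count, which is one simple way to generate parameters "according to the estimated segment duration".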
Pages: 4021 - 4024
Page count: 4
Related papers
  • [1] Harmonics Plus Noise Model Based Vocoder for Statistical Parametric Speech Synthesis
    Erro, Daniel
    Sainz, Inaki
    Navas, Eva
    Hernaez, Inma
    IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2014, 8 (02) : 184 - 194
  • [2] Statistical parametric speech synthesis with a novel codebook-based excitation model
    Csapo, Tamas Gabor
    Nemeth, Geza
    INTELLIGENT DECISION TECHNOLOGIES-NETHERLANDS, 2014, 8 (04) : 289 - 299
  • [3] Statistical parametric speech synthesis
    Black, Alan W.
    Zen, Heiga
    Tokuda, Keiichi
    2007 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, VOL IV, PTS 1-3, 2007, : 1229 - +
  • [4] Statistical parametric speech synthesis
    Zen, Heiga
    Tokuda, Keiichi
    Black, Alan W.
    SPEECH COMMUNICATION, 2009, 51 (11) : 1039 - 1064
  • [5] STATISTICAL PARAMETRIC SPEECH SYNTHESIS BASED ON PRODUCT OF EXPERTS
    Zen, Heiga
    Gales, Mark J. F.
    Nankaku, Yoshihiko
    Tokuda, Keiichi
    2010 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING, 2010, : 4242 - 4245
  • [6] Statistical parametric speech synthesis using a hidden trajectory model
    Cai, Ming-Qi
    Ling, Zhen-Hua
    Dai, Li-Rong
    SPEECH COMMUNICATION, 2015, 72 : 149 - 159
  • [7] NEURAL SOURCE-FILTER-BASED WAVEFORM MODEL FOR STATISTICAL PARAMETRIC SPEECH SYNTHESIS
    Wang, Xin
    Takaki, Shinji
    Yamagishi, Junichi
    2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2019, : 5916 - 5920
  • [8] Statistical Parametric Speech Synthesis: A Review
    Aroon, Athira
    Dhonde, S. B.
    PROCEEDINGS OF 2015 IEEE 9TH INTERNATIONAL CONFERENCE ON INTELLIGENT SYSTEMS AND CONTROL (ISCO), 2015,
  • [9] Statistical parametric speech synthesis for Ibibio
    Ekpenyong, Moses
    Urua, Eno-Abasi
    Watts, Oliver
    King, Simon
    Yamagishi, Junichi
    SPEECH COMMUNICATION, 2014, 56 : 243 - 251
  • [10] An introduction to statistical parametric speech synthesis
    King, Simon
    SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES, 2011, 36 (05) : 837 - 852