Production of filled pauses in concatenative speech synthesis based on the underlying fluent sentence

Cited by: 24
Authors
Adell, Jordi [1 ]
Escudero, David [2 ]
Bonafonte, Antonio [1 ]
Affiliations
[1] Univ Politecn Cataluna, Barcelona, Spain
[2] Univ Valladolid, Valladolid, Spain
Keywords
Speech synthesis; Conversational speech; Talking speech synthesiser; Filled pause; Disfluency; Underlying fluent sentence; Prosody; Ogmios; Perceptual evaluation; UM; UH; REPAIR; CORPUS;
DOI
10.1016/j.specom.2011.10.010
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
Until now, speech synthesis has mainly involved reading-style speech. Today, however, text-to-speech systems must provide a variety of styles because users expect these interfaces to do more than just read information. If synthetic voices must be integrated into future technology, they must simulate the way people talk instead of the way people read. Existing knowledge about how disfluencies occur has made it possible to propose a general framework for synthesising disfluencies. We propose a model based on the definition of disfluency and the concept of underlying fluent sentences. The model incorporates the parameters of standard prosodic models for fluent speech with local modifications of prosodic parameters near the interruption point. The constituents of the local models for filled pauses are derived from the analysis corpus, and constituent's prosodic parameters are predicted via linear regression analysis. We also discuss the implementation details of the model when used in a real speech synthesis system. Objective and perceptual evaluations showed that the proposed models outperformed the baseline model. Perceptual evaluations of the system showed that it is possible to synthesise filled pauses without decreasing the overall naturalness of the system, and users stated that the speech produced is even more natural than the one produced without filled pauses. (C) 2011 Elsevier B.V. All rights reserved.
Pages: 459-476
Page count: 18