On the Role of Spectral Dynamics in Unit Selection Speech Synthesis

Authors
Kirkpatrick, Barry [1]
O'Brien, Darragh [1]
Scaife, Ronan [1]
Errity, Andrew [1]
Affiliations
[1] Dublin City Univ, Fac Engn & Comp, Res Inst Networks & Commun Engn, Dublin 9, Ireland
Source
INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4 | 2007
Keywords
speech synthesis; join costs; auditory perception; spectral dynamics; feature extraction
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Cost functions employed in unit selection significantly influence the quality of speech output. Although unit selection can produce very natural-sounding speech, the quality can be inconsistent and is difficult to guarantee because of discontinuities between incompatible units. The join cost, which measures the suitability of concatenating speech units, typically consists of sub-costs representing the fundamental frequency and spectrum at the boundaries of each unit. In this study, the role of spectral dynamics as a join cost in unit selection synthesis is explored. A number of spectral dynamic measures are tested for the task of detecting discontinuities. Results indicate that spectral dynamic measures correlate with human perception of discontinuity if the features are extracted appropriately. Spectral dynamic mismatch is found to be a source of discontinuity, although results suggest that it is likely to occur simultaneously with static spectral mismatch.
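
As a rough illustration of the kind of join cost the abstract describes, the sketch below combines a fundamental-frequency sub-cost, a static spectral sub-cost and a spectral dynamic sub-cost at a unit boundary. It is not the paper's method: the unit layout (dictionaries with "spec" and "f0" lists), the two-frame slope used as the dynamic feature, the Euclidean distance and the equal weights are all assumptions made for this example.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def slope(frame_a, frame_b):
    """Per-coefficient change from frame_a to frame_b (a crude dynamic feature)."""
    return [y - x for x, y in zip(frame_a, frame_b)]


def join_cost(left, right, w_f0=1.0, w_static=1.0, w_dynamic=1.0):
    """Hypothetical join cost between two candidate units.

    Each unit is a dict with (assumed layout):
      "spec": list of spectral feature frames, at least two frames long
      "f0":   list of positive fundamental-frequency values in Hz
    The weights are illustrative, not taken from the paper.
    """
    # F0 sub-cost: log-F0 difference across the boundary.
    f0_cost = abs(math.log(left["f0"][-1]) - math.log(right["f0"][0]))

    # Static spectral sub-cost: distance between the two boundary frames.
    static_cost = euclidean(left["spec"][-1], right["spec"][0])

    # Dynamic spectral sub-cost: mismatch between the spectral trajectory
    # leaving the left unit and the one entering the right unit.
    left_slope = slope(left["spec"][-2], left["spec"][-1])
    right_slope = slope(right["spec"][0], right["spec"][1])
    dynamic_cost = euclidean(left_slope, right_slope)

    return w_f0 * f0_cost + w_static * static_cost + w_dynamic * dynamic_cost
```

In this sketch, two units whose boundary frames match but whose trajectories point in opposite directions incur a zero static cost and a non-zero dynamic cost. That is the kind of case a purely static spectral measure would not register.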
Pages: 2029-2032
Page count: 4