On the Role of Spectral Dynamics in Unit Selection Speech Synthesis

被引:0
|
作者
Kirkpatrick, Barry [1 ]
O'Brien, Darragh [1 ]
Scaife, Ronan [1 ]
Errity, Andrew [1 ]
机构
[1] Dublin City Univ, Fac Engn & Comp, Res Inst Networks & Commun Engn, Dublin 9, Ireland
来源
INTERSPEECH 2007: 8TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION, VOLS 1-4 | 2007年
关键词
speech synthesis; join costs; auditory perception; spectral dynamics; feature extraction;
D O I
暂无
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Cost functions employed in unit selection significantly influence the quality of speech output. Although unit selection can produce very natural sounding speech the quality can be inconsistent and is difficult to guarantee due to discontinuities between incompatible units. The join cost employed in unit selection to measure the suitability of concatenating speech units typically consists of sub costs representing the fundamental frequency and spectrum at the boundaries of each unit. In this study the role of spectral dynamics as a join cost in unit selection synthesis is explored. A number of spectral dynamic measures are tested for the task of detecting discontinuities. Results indicate that spectral dynamic measures correlate with human perception of discontinuity if the features are extracted appropriately. Spectral dynamic mismatch is found to be a source of discontinuity although results suggest this is likely to occur simultaneously with static spectral mismatch.
引用
收藏
页码:2029 / 2032
页数:4
相关论文
共 50 条
  • [1] PREDICTING SPECTRAL AND PROSODIC PARAMETERS FOR UNIT SELECTION IN SPEECH SYNTHESIS
    Dong, Minghui
    Li, Haizhou
    2008 6TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2008, : 133 - 136
  • [2] Spectral dynamics as a source of discontinuity in concatenative speech synthesis
    Kirkpatrick, Barry
    O'Brien, Darragh
    Scaife, Ronan
    Errity, Andrew
    PROCEEDINGS OF THE 2007 15TH INTERNATIONAL CONFERENCE ON DIGITAL SIGNAL PROCESSING, 2007, : 615 - +
  • [3] Assessing a Speaker for Fast Speech in Unit Selection Speech Synthesis
    Moers, Donata
    Wagner, Petra
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2015 - +
  • [4] One-Class Classification for Spectral Join Cost Calculation in Unit Selection Speech Synthesis
    Karabetsos, Sotiris
    Tsiakoulis, Pirros
    Chalamandaris, Aimilios
    Raptis, Spyros
    IEEE SIGNAL PROCESSING LETTERS, 2010, 17 (08) : 746 - 749
  • [5] Optimal Utterance Selection for Unit Selection Speech Synthesis Databases
    Alan W. Black
    Kevin Lenzo
    International Journal of Speech Technology, 2003, 6 (4) : 357 - 363
  • [6] Control of spectral dynamics in concatenative speech synthesis
    Wouters, J
    Macon, MW
    IEEE TRANSACTIONS ON SPEECH AND AUDIO PROCESSING, 2001, 9 (01): : 30 - 38
  • [7] Polish unit selection speech synthesis with BOSS: extensions and speech corpora
    Demenko, Grazyna
    Klessa, Katarzyna
    Szymanski, Marcin
    Breuer, Stefan
    Hess, Wolfgang
    INTERNATIONAL JOURNAL OF SPEECH TECHNOLOGY, 2010, 13 (02) : 85 - 99
  • [8] Joint Prosodic and Segmental Unit Selection Speech Synthesis
    Clark, Robert A. J.
    King, Simon
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1312 - 1315
  • [9] COMPRESSED SENSING FOR UNIT SELECTION BASED SPEECH SYNTHESIS
    Sharma, Pulkit
    Abrol, Vinayak
    Sao, Anil Kumar
    2015 23RD EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2015, : 1731 - 1735
  • [10] The Target Cost Formulation in Unit Selection Speech Synthesis
    Taylor, Paul
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 2038 - 2041