A RESEARCH BED FOR UNIT SELECTION BASED TEXT TO SPEECH SYNTHESIS

Authors
Sarathy, K. Partha [1]
Ramakrishnan, A. G. [2]
Affiliations
[1] Ctr Dev Telemat, Bangalore 560100, Karnataka, India
[2] Indian Inst Sci, Dept Elect Engn, Bangalore 560100, Karnataka, India
Source
2008 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY: SLT 2008, PROCEEDINGS | 2008
Keywords
speech synthesis; speech codecs; intelligibility; naturalness; perception;
DOI
Not available
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405;
Abstract
The paper describes a modular, unit-selection-based TTS framework that can serve as a research bed for developing TTS in any new language and for studying the effect of changing any parameter during synthesis. Using this framework, a TTS system has been developed for Tamil. The synthesis database consists of 1027 phonetically rich prerecorded sentences. The framework has already been tested for Kannada. Our TTS synthesizes intelligible and acceptably natural speech, as supported by high mean opinion scores. The framework is further optimized to suit embedded applications such as mobile phones and PDAs. We compressed the synthesis speech database with standard speech compression algorithms used in commercial GSM phones and evaluated the quality of the resulting synthesized sentences. Even with a highly compressed database, the synthesized output is perceptually close to that obtained with the uncompressed database. Through experiments, we explored the ambiguities in human perception when listening to Tamil phones and syllables uttered in isolation, and we propose exploiting this misperception to substitute for missing phone contexts in the database. Listening experiments have been conducted on sentences synthesized by deliberately replacing phones with the phones they are confused with.
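The substitution idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `pick_unit` and `plan_units`, the `CONFUSIONS` table, and its entries are all hypothetical placeholders, and the sketch works on bare phone labels rather than the phone contexts the paper refers to.

```python
from typing import Dict, List, Optional, Set

# Hypothetical confusion table (placeholder entries, NOT results from the
# paper): each phone maps to the phones listeners might confuse it with,
# ordered from most to least confusable.
CONFUSIONS: Dict[str, List[str]] = {
    "p": ["b"],
    "t": ["d", "k"],
}


def pick_unit(phone: str, inventory: Set[str],
              confusions: Dict[str, List[str]] = CONFUSIONS) -> Optional[str]:
    """Return the phone itself if the database inventory contains it,
    otherwise the first confusable substitute that is available,
    otherwise None."""
    if phone in inventory:
        return phone
    for candidate in confusions.get(phone, []):
        if candidate in inventory:
            return candidate
    return None


def plan_units(phones: List[str], inventory: Set[str]) -> List[Optional[str]]:
    """Choose one unit label per requested phone, falling back to
    perceptually confusable phones when the exact one is missing."""
    return [pick_unit(p, inventory) for p in phones]
```

In this sketch a request for a phone that is absent from the inventory falls back to the most confusable available phone, and returns `None` when no substitute exists, leaving the caller to decide how to handle the gap.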
Pages: 229 / +
Page count: 2