A RESEARCH BED FOR UNIT SELECTION BASED TEXT TO SPEECH SYNTHESIS

Cited by: 0
Authors
Sarathy, K. Partha [1 ]
Ramakrishnan, A. G. [2 ]
Affiliations
[1] Ctr Dev Telemat, Bangalore 560100, Karnataka, India
[2] Indian Inst Sci, Dept Elect Engn, Bangalore 560100, Karnataka, India
Source
2008 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY: SLT 2008, PROCEEDINGS | 2008
Keywords
speech synthesis; speech codecs; intelligibility; naturalness; perception;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
The paper describes a modular, unit-selection-based TTS framework that can serve as a research bed both for developing TTS in a new language and for studying the effect of changing any parameter during synthesis. Using this framework, a TTS system has been developed for Tamil; the synthesis database consists of 1027 phonetically rich prerecorded sentences. The framework has also been tested for Kannada. Our TTS synthesizes intelligible and acceptably natural speech, as supported by high mean opinion scores. The framework is further optimized to suit embedded applications such as mobile phones and PDAs. We compressed the synthesis speech database with standard speech compression algorithms used in commercial GSM phones and evaluated the quality of the resulting synthesized sentences. Even with a highly compressed database, the synthesized output is perceptually close to that obtained with the uncompressed database. Through experiments, we explored the ambiguities in human perception of Tamil phones and syllables uttered in isolation, and we propose exploiting this misperception to substitute for phone contexts missing from the database. Listening experiments have been conducted on sentences synthesized by deliberately replacing phones with the phones they are confused with.
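The substitution idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the context key `(left, phone, right)`, the function name `find_unit`, and the confusion table are all hypothetical, and the phone symbols are placeholders rather than confusion data reported in the paper. The sketch only shows the general fallback: if the requested phone context is missing from the unit inventory, try the same context with a perceptually confusable phone.

```python
from typing import Dict, List, Optional, Set, Tuple

# A unit is keyed by (left context, phone, right context). Hypothetical layout.
Context = Tuple[str, str, str]

# Hypothetical confusion table: placeholder symbols, NOT results from the paper.
CONFUSABLE: Dict[str, List[str]] = {
    "n": ["m"],
    "m": ["n"],
}


def find_unit(
    target: Context,
    inventory: Set[Context],
    confusable: Dict[str, List[str]] = CONFUSABLE,
) -> Optional[Context]:
    """Return a context available in the inventory, or None.

    The exact context is preferred. If it is missing, the centre phone is
    replaced by each of its confusable phones in turn, keeping the
    neighbouring context unchanged.
    """
    if target in inventory:
        return target
    left, phone, right = target
    for alternative in confusable.get(phone, []):
        candidate = (left, alternative, right)
        if candidate in inventory:
            return candidate
    return None


if __name__ == "__main__":
    inventory = {("a", "m", "a"), ("i", "k", "u")}
    # ("a", "n", "a") is missing, so the confusable phone "m" is tried.
    print(find_unit(("a", "n", "a"), inventory))
```

In a real system the fallback would presumably be one candidate among several in the unit selection cost, and the confusion table would come from listening experiments such as those the abstract describes.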
Pages: 229 / +
Number of pages: 2
Related papers
50 in total
  • [31] Development and Evaluation of Polish Speech Corpus for Unit Selection Speech Synthesis Systems
    Demenko, G.
    Bachan, J.
    Moebius, B.
    Klessa, K.
    Szymanski, M.
    Grocholewski, S.
    INTERSPEECH 2008: 9TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2008, VOLS 1-5, 2008, : 1650 - +
  • [32] Reducing footprint of unit selection based text-to-speech system using compressed sensing and sparse representation
    Sharma, Pulkit
    Abrol, Vinayak
    Nivedita
    Sao, Anil Kumar
    COMPUTER SPEECH AND LANGUAGE, 2018, 52 : 191 - 208
  • [33] Phone-Level Embeddings for Unit Selection Speech Synthesis
    Perquin, Antoine
    Lecorve, Gwenole
    Lolive, Damien
    Amsaleg, Laurent
    STATISTICAL LANGUAGE AND SPEECH PROCESSING, SLSP 2018, 2018, 11171 : 21 - 31
  • [34] On the Impact of Annotation Errors on Unit-Selection Speech Synthesis
    Matousek, Jindrich
    Tihelka, Daniel
    Smidl, Lubos
    TEXT, SPEECH AND DIALOGUE, TSD 2012, 2012, 7499 : 456 - 463
  • [35] Unifying Unit Selection and Hidden Markov Model Speech Synthesis
    Taylor, Paul
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1758 - 1761
  • [36] Trainable unit selection speech synthesis under statistical framework
    Wang RenHua
    Science Bulletin, 2009, (11) : 1963 - 1969
  • [37] Trainable unit selection speech synthesis under statistical framework
    Wang RenHua
    Dai LiRong
    Ling ZhenHua
    Hu Yu
    CHINESE SCIENCE BULLETIN, 2009, 54 (11): : 1963 - 1969
  • [38] FarsBayan: A Unit Selection based Farsi Speech Synthesizer
    Homayounpour, M. Mehdi
    Namnabat, Majid
    INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5, 2006, : 1336 - 1339
  • [39] Optimizing Phonetic Encoding for Viennese Unit Selection Speech Synthesis
    Pucher, Michael
    Neubarth, Friedrich
    Strom, Volker
    DEVELOPMENT OF MULTIMODAL INTERFACES: ACTIVE LISTENING AND SYNCHRONY, 2010, 5967 : 207 - +
  • [40] PREDICTING SPECTRAL AND PROSODIC PARAMETERS FOR UNIT SELECTION IN SPEECH SYNTHESIS
    Dong, Minghui
    Li, Haizhou
    2008 6TH INTERNATIONAL SYMPOSIUM ON CHINESE SPOKEN LANGUAGE PROCESSING, PROCEEDINGS, 2008, : 133 - 136