A RESEARCH BED FOR UNIT SELECTION BASED TEXT TO SPEECH SYNTHESIS

Cited by: 0
Authors
Sarathy, K. Partha [1]
Ramakrishnan, A. G. [2]
Affiliations
[1] Ctr Dev Telemat, Bangalore 560100, Karnataka, India
[2] Indian Inst Sci, Dept Elect Engn, Bangalore 560100, Karnataka, India
Source
2008 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY: SLT 2008, PROCEEDINGS | 2008
Keywords
speech synthesis; speech codecs; intelligibility; naturalness; perception
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The paper describes a modular, unit-selection-based TTS framework that can be used as a research bed for developing TTS in any new language, as well as for studying the effect of changing any parameter during synthesis. Using this framework, TTS has been developed for Tamil. The synthesis database consists of 1027 phonetically rich prerecorded sentences. The framework has already been tested for Kannada. Our TTS synthesizes intelligible and acceptably natural speech, as supported by high mean opinion scores. The framework is further optimized to suit embedded applications such as mobile phones and PDAs. We compressed the synthesis speech database with standard speech compression algorithms used in commercial GSM phones and evaluated the quality of the resulting synthesized sentences. Even with a highly compressed database, the synthesized output is perceptually close to that obtained with the uncompressed database. Through experiments, we explored the ambiguities in human perception when listening to Tamil phones and syllables uttered in isolation, and we propose exploiting this misperception to substitute for missing phone contexts in the database. Listening experiments have been conducted on sentences synthesized by deliberately replacing phones with the phones they are commonly confused with.
Pages: 229 / +
Page count: 2
Related papers
50 in total
  • [1] COMPRESSED SENSING FOR UNIT SELECTION BASED SPEECH SYNTHESIS
    Sharma, Pulkit
    Abrol, Vinayak
    Sao, Anil Kumar
    2015 23RD EUROPEAN SIGNAL PROCESSING CONFERENCE (EUSIPCO), 2015, : 1731 - 1735
  • [2] A unit selection text-to-speech-and-singing synthesis framework from neutral speech: proof of concept
    Marc Freixes
    Francesc Alías
    Joan Claudi Socoró
    EURASIP Journal on Audio, Speech, and Music Processing, 2019
  • [3] A unit selection text-to-speech-and-singing synthesis framework from neutral speech: proof of concept
    Freixes, Marc
    Alias, Francesc
    Claudi Socoro, Joan
    EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING, 2019, 2019 (01)
  • [4] An efficient unit-selection method for concatenative Text-to-speech synthesis systems
    Gros, Jerneja Zganec
    Zganec, Mario
    Journal of Computing and Information Technology, 2008, 16 (01) : 69 - 78
  • [5] A Unit Selection Text-to-Speech Synthesis System Optimized for Use with Screen Readers
    Chalamandaris, Aimilios
    Karabetsos, Sotiris
    Tsiakoulis, Pirros
    Raptis, Spyros
    IEEE TRANSACTIONS ON CONSUMER ELECTRONICS, 2010, 56 (03) : 1890 - 1897
  • [6] Unit Selection based Speech Synthesis for Poor Channel Condition
    Cen, Ling
    Dong, Minghui
    Chan, Paul
    Li, Haizhou
    INTERSPEECH 2009: 10TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2009, VOLS 1-5, 2009, : 2035 - 2038
  • [7] A Comparison of Speaker-based and Utterance-based Data Selection for Text-to-Speech Synthesis
    Lee, Kai-Zhan
    Cooper, Erica
    Hirschberg, Julia
    19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES, 2018, : 2873 - 2877
  • [8] Minimum unit selection error training for HMM-based unit selection speech synthesis system
    Ling, Zhen-Hua
    Wang, Ren-Hua
    2008 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING, VOLS 1-12, 2008, : 3949 - 3952
  • [9] A Small Footprint Hybrid Statistical and Unit Selection Text-to-Speech Synthesis System for Turkish
    Guner, Ekrem
    Demiroglu, Cenk
    COMPUTER AND INFORMATION SCIENCES II, 2012, : 85 - 91
  • [10] Prominence-Based Prosody Prediction for Unit Selection Speech Synthesis
    Windmann, Andreas
    Jauk, Igor
    Tamburini, Fabio
    Wagner, Petra
    12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5, 2011, : 332 - +