An Overview of the ILSP Unit Selection Text-to-Speech Synthesis System

Citations: 0
Authors
Tsiakoulis, Pirros [1 ]
Karabetsos, Sotiris [1 ]
Chalamandaris, Aimilios [1 ]
Raptis, Spyros [1 ]
Affiliations
[1] Res Ctr Athena, Inst Language & Speech Proc, GR-15125 Athens, Greece
Source
ARTIFICIAL INTELLIGENCE: METHODS AND APPLICATIONS | 2014 / Vol. 8445
Keywords
Text to speech; unit selection; TTS; concatenative speech synthesis; COST
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
This paper presents an overview of the Text-to-Speech synthesis system developed at the Institute for Language and Speech Processing (ILSP), focusing on the key issues in the design of the system components. The system currently fully supports three languages (Greek, English, and Bulgarian) and is designed to be as language- and speaker-independent as possible. Experimental results are also presented, showing that the system produces high-quality synthetic speech in terms of naturalness and intelligibility. The system was recently ranked among the top three systems worldwide in terms of achieved quality for the English language at the international Blizzard Challenge 2013 workshop.
Pages: 370-383
Page count: 14