BOOTSTRAPPING TEXT-TO-SPEECH FOR SPEECH PROCESSING IN LANGUAGES WITHOUT AN ORTHOGRAPHY

Citations: 0
Authors
Sitaram, Sunayana [1]
Palkar, Sukhada [1]
Chen, Yun-Nung [1]
Parlikar, Alok [1]
Black, Alan W. [1]
Affiliations
[1] Carnegie Mellon Univ, Language Technol Inst, Pittsburgh, PA 15213 USA
Source
2013 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP) | 2013
Keywords
Speech Synthesis; Synthesis without Text; Languages without an Orthography
DOI
Not available
Chinese Library Classification number
O42 [Acoustics]
Discipline classification codes
070206; 082403
Abstract
Speech synthesis technology has reached the stage where, given a well-designed corpus of audio and accurate transcription, an at least understandable synthesizer can be built without necessarily resorting to new innovations. However, many languages do not have a well-defined writing system, yet such languages could still greatly benefit from speech systems. In this paper, we consider the case where we have a (potentially large) single-speaker database but have no transcriptions and no standardized way to write transcriptions. To address this scenario, we propose a method that allows us to bootstrap synthetic voices purely from speech data. We use a novel combination of automatic speech recognition and automatic word segmentation for the bootstrapping. Our experimental results on speech corpora in two languages, English and German, show that synthetic voices built using this method are close to understandable. Our method is language-independent and can thus be used to build synthetic voices from a speech corpus in any new language.
Pages: 7992-7996
Page count: 5