An Investigation of Convolution Attention Based Models for Multilingual Speech Synthesis of Indian Languages

Cited by: 7
Authors
Baljekar, Pallavi [1]
Rallabandi, SaiKrishna [1]
Black, Alan W. [1]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Source
19TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2018), VOLS 1-6: SPEECH RESEARCH FOR EMERGING MARKETS IN MULTILINGUAL SOCIETIES | 2018
Keywords
End-to-end Synthesis; Convolutional Model; Neural Networks; Indic Languages
DOI
10.21437/Interspeech.2018-1869
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we investigate multi-speaker, multilingual speech synthesis for four Indic languages (Hindi, Marathi, Gujarati, Bengali) as well as English in a fully convolutional attention-based model. We show how factored embeddings can allow cross-lingual transfer, and investigate methods to adapt the model in a low-resource scenario for Marathi and Gujarati. We also show results on how effectively the model scales to a new language and how much data is required to train the system on a new language.
Pages: 2474-2478
Number of pages: 5