An Investigation of Convolution Attention Based Models for Multilingual Speech Synthesis of Indian Languages

Cited by: 7
Authors
Baljekar, Pallavi [1]
Rallabandi, SaiKrishna [1]
Black, Alan W. [1]
Affiliations
[1] Carnegie Mellon Univ, Pittsburgh, PA 15213 USA
Source
19th Annual Conference of the International Speech Communication Association (INTERSPEECH 2018), Vols 1-6: Speech Research for Emerging Markets in Multilingual Societies | 2018
Keywords
End-to-end Synthesis; Convolutional Model; Neural Networks; Indic Languages
DOI
10.21437/Interspeech.2018-1869
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In this paper, we investigate multi-speaker, multilingual speech synthesis for four Indic languages (Hindi, Marathi, Gujarati, Bengali) as well as English in a fully convolutional attention-based model. We show how factored embeddings can allow cross-lingual transfer, and investigate methods to adapt the model in a low-resource scenario for the case of Marathi and Gujarati. We also show results on how effectively the model scales to a new language and how much data is required to train the system on a new language.
Pages: 2474-2478
Number of pages: 5