English to Sinhala Neural Machine Translation

Times cited: 0
Authors
Fonseka, Thilakshi [1]
Naranpanawa, Rashmini [1]
Perera, Ravinga [1]
Thayasivam, Uthayasanker [1]
Affiliation
[1] Univ Moratuwa, Dept Comp Sci & Engn, Katubedda 10400, Sri Lanka
Source
2020 INTERNATIONAL CONFERENCE ON ASIAN LANGUAGE PROCESSING (IALP 2020) | 2020
Keywords
Neural Machine Translation (NMT); Low resource; Domain specific;
DOI
10.1109/ialp51396.2020.9310462
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Neural Machine Translation (NMT) is the current state-of-the-art approach to machine translation and has shown promising results for high-resource languages. However, NMT underperforms heavily in low-resource settings. English proficiency among Sri Lankans is very low, and only a small number of people can understand and speak English. Since Sinhala is the most widely used language in Sri Lanka, there is a large demand for quality English-to-Sinhala translation to share knowledge among locals. Sinhala differs from English in both morphology and syntax, which makes translating English text into Sinhala highly challenging. In this paper, we introduce an effective NMT system that uses Byte Pair Encoding (BPE) for the English-Sinhala language pair, focusing on official Sri Lankan government documents.
Pages: 305-309
Page count: 5