Neural machine translation for limited resources English-Nyishi pair

Cited by: 2
Authors
Kakum, Nabam [1 ]
Laskar, Sahinur Rahman [2 ]
Sambyo, Koj [1 ]
Pakray, Partha [3 ]
Affiliations
[1] Natl Inst Technol, Dept Comp Sci & Engn, Itanagar, Arunachal Pradesh, India
[2] Univ Petr & Energy Studies, Sch Comp Sci, Dehra Dun, India
[3] Natl Inst Technol, Dept Comp Sci & Engn, Silchar, India
Source
SADHANA-ACADEMY PROCEEDINGS IN ENGINEERING SCIENCES | 2023, Vol. 48, No. 4
Keywords
English-Nyishi; NMT; low-resource; corpus;
DOI
10.1007/s12046-023-02308-8
Chinese Library Classification
T [Industrial Technology];
Discipline Classification Code
08;
Abstract
Neural machine translation handles sequential data with variable-length input and output sentences and has become the state-of-the-art approach to machine translation. Although neural machine translation performs well on both low- and high-resource language pairs, it requires adequate parallel training data. For low-resource languages, preparing such a corpus is strenuous and time-consuming. Automatic translation systems such as Google Translate and Bing Translator cover several under-resourced Indian languages but do not support Nyishi, largely because no suitable dataset exists. In this work, we contribute a parallel corpus for the low-resource English-Nyishi language pair and report comparative experiments with baseline neural machine translation systems. Results for English-to-Nyishi and Nyishi-to-English translation are evaluated with well-known automatic evaluation metrics and by manual evaluation.
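The abstract does not name the automatic metrics used; BLEU is the most common choice for comparing NMT baselines. The sketch below is illustrative only and is not the authors' evaluation pipeline: it scores a file of system translations against a reference file with the sacrebleu library, and the file names are hypothetical placeholders.

# Minimal sketch: corpus-level BLEU scoring of NMT output with sacrebleu.
# The file names 'nyishi.hyp' and 'nyishi.ref' are hypothetical placeholders,
# not artifacts released with the paper.
import sacrebleu

def score_translations(hyp_path, ref_path):
    """Return corpus-level BLEU for one hypothesis file against one reference file."""
    with open(hyp_path, encoding="utf-8") as f:
        hypotheses = [line.strip() for line in f]
    with open(ref_path, encoding="utf-8") as f:
        references = [line.strip() for line in f]
    # sacrebleu expects a list of reference streams; here there is a single stream.
    bleu = sacrebleu.corpus_bleu(hypotheses, [references])
    return bleu.score

if __name__ == "__main__":
    # One sentence per line in each file, hypotheses aligned with references.
    print(f"BLEU: {score_translations('nyishi.hyp', 'nyishi.ref'):.2f}")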
Pages: 12