A Neural Attention-Based Encoder-Decoder Approach for English to Bangla Translation

Cited by: 0
Authors
Al Shiam, Abdullah [1]
Redwan, Sadi Md. [2]
Kabir, Humaun [3]
Shin, Jungpil [4]
Affiliations
[1] Sheikh Hasina Univ, Dept Comp Sci & Engn, Netrokona 2400, Bangladesh
[2] Univ Rajshahi, Dept Comp Sci & Engn, Rajshahi 6205, Bangladesh
[3] Bangamata Sheikh Fojilatunnesa Mujib Sci & Technol, Dept Comp Sci & Engn, Jamalpur 2012, Bangladesh
[4] Univ Aizu Aizuwakamatsu, Sch Comp Sci & Engn, Fukushima 9658580, Japan
Keywords
Neural Machine Translation (NMT); Machine Translation (MT); Encoder-Decoder Model; Neural Attention
DOI
10.56415/csjm.v31.04
Chinese Library Classification (CLC) number
TP301 [Theory, Methods]
Discipline classification code
081202
Abstract
Machine translation (MT) is the process of translating text from one language to another using bilingual datasets and grammatical rules. Recent work in MT has popularized sequence-to-sequence models that leverage neural attention and deep learning. The success of neural attention models has yet to be carried over into a robust framework for automated English-to-Bangla translation, owing to the lack of a comprehensive dataset that covers the diverse vocabulary of the Bangla language. In this study, we propose an English-to-Bangla MT system based on an encoder-decoder attention model and the CCMatrix corpus. Our results show that this model can outperform traditional statistical (SMT) and rule-based (RBMT) machine translation models, achieving a Bilingual Evaluation Understudy (BLEU) score of 15.68 despite being constrained by the limited vocabulary of the corpus. We hypothesize that, given a more diverse and accurate dataset, this model could be used successfully for state-of-the-art machine translation. This work can be extended to incorporate several newer datasets using transfer learning techniques.
Pages: 70-85
Page count: 16