Improve example-based machine translation quality for low-resource language using ontology

Cited by: 1
Authors
Khan Md Anwarus K.M.A. [1]
Yamada S. [2]
Tetsuro N. [3]
Affiliations
[1] IBM Research Tokyo, 19-21 Nihonbashi, Hakozaki-cho, Chuo-ku, Tokyo
[2] NTT Corporation, NTT Hibiya Building, 1-1-6 Uchisaiwai-cho, Chiyoda-ku, Tokyo
[3] University of Electro-Communications, Graduate School of Informatics and Engineering, 1-5-1 Chofugaoka, Chofu, Tokyo
Keywords
Example-based machine translation; Knowledge engineering; WordNet
DOI
10.2991/ijndc.2017.5.3.6
Abstract
In this research we propose using an ontology to improve the performance of an example-based machine translation (EBMT) system for a low-resource language pair. The EBMT architecture uses chunk-string templates (CSTs) and an unknown-word translation mechanism. A CST consists of a chunk in the source language, a string in the target language, and word alignment information. For unknown-word translation, we used the WordNet hypernym tree and an English-Bengali dictionary. CSTs improved wide coverage by 57 points and quality by 48.81 points in human evaluation. Currently, 64.29% of the system's test-set translations were acceptable. The combined solution of CSTs and unknown-word translation generated 67.85% acceptable translations from the test set. The unknown-word mechanism improved translation quality by 3.56 points in human evaluation. Copyright © 2017, the Authors.
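The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class `ChunkStringTemplate`, the function `translate_unknown`, the toy hypernym map, and the placeholder target strings such as `<bn:bird>` are all hypothetical names introduced here for illustration. A real system would presumably query WordNet and an English-Bengali dictionary instead of these hard-coded tables.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class ChunkStringTemplate:
    """Hypothetical container for the three CST parts listed in the abstract."""
    source_chunk: Tuple[str, ...]            # chunk in the source language
    target_string: Tuple[str, ...]           # string in the target language
    alignment: Tuple[Tuple[int, int], ...]   # (source index, target index) pairs


# Toy stand-ins (assumptions): word -> its hypernym, and word -> translation.
HYPERNYMS: Dict[str, str] = {"robin": "bird", "bird": "animal"}
DICTIONARY: Dict[str, str] = {"bird": "<bn:bird>", "animal": "<bn:animal>"}


def translate_unknown(word: str,
                      hypernyms: Dict[str, str] = HYPERNYMS,
                      dictionary: Dict[str, str] = DICTIONARY) -> Optional[str]:
    """Walk up the hypernym chain until a word with a dictionary entry is found."""
    seen = set()
    current: Optional[str] = word
    while current is not None and current not in seen:
        if current in dictionary:
            return dictionary[current]
        seen.add(current)                  # guard against cycles in the toy map
        current = hypernyms.get(current)   # None once the chain ends
    return None


cst = ChunkStringTemplate(
    source_chunk=("the", "bird"),
    target_string=("<bn:bird>",),
    alignment=((1, 0),),                   # source word 1 aligns to target word 0
)
```

In this sketch, `translate_unknown("robin")` falls back to the entry for its hypernym `bird`, while a word with no hypernym path to any dictionary entry yields `None`.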
Pages: 176-191
Number of pages: 15