Polyglot machine translation

Cited by: 3
Authors
Leiva, Luis A. [1, 2]
Alabau, Vicent [1, 2]
Affiliations
[1] CPI UPV, Sciling, Valencia 46022, Spain
[2] Univ Politecn Valencia, E-46022 Valencia, Spain
Keywords
Minority languages; machine translation; linguistic coverage; vocabulary; human factors
DOI
10.3233/JIFS-152533
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
Machine Translation (MT) requires a large amount of linguistic resources, which leads current MT systems to leave unknown words untranslated. This can frustrate end users, who may not understand such untranslated words at all. However, most language families share a common vocabulary, so this knowledge can be leveraged to produce more understandable translations, typically for "assimilation" or gisting use. Based on this observation, we propose a method that constructs polyglot translations tailored to a particular user language. Simply put, an unknown word is translated into a set of languages related to the user's language, and the translated word that is closest to the user's language replaces the unknown word. Experimental results on language coverage over three language families indicate that our method may improve the usefulness of MT systems. As confirmed by a subsequent human evaluation, polyglot translations do look familiar to users and are perceived as easier to read and understand than translations in the related natural languages.
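The replacement step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how closeness to the user's language is measured, so the sketch assumes a simple character-level similarity (the standard library's `difflib.SequenceMatcher`) against a small user-language word list. The names `closeness` and `polyglot_replace`, the sample vocabulary, and the candidate translations are all hypothetical and are not taken from the paper.

```python
from difflib import SequenceMatcher


def closeness(word, user_vocab):
    """Best string similarity between `word` and any word in the user's vocabulary.

    Assumption: character-level similarity stands in for the (unspecified)
    closeness measure mentioned in the abstract.
    """
    return max(SequenceMatcher(None, word, v).ratio() for v in user_vocab)


def polyglot_replace(unknown_word, candidates, user_vocab):
    """Pick the candidate translation that looks most like the user's language.

    `candidates` maps a related-language code to that language's translation
    of `unknown_word`. If no related language can translate the word, the
    unknown word is returned unchanged.
    """
    if not candidates:
        return unknown_word
    return max(candidates.values(), key=lambda w: closeness(w, user_vocab))


# Hypothetical example: a Spanish-speaking user, an English source word that
# the MT system cannot translate into Spanish, and three related languages.
spanish_vocab = {"noche", "agua", "libro"}
candidates = {"pt": "noite", "fr": "nuit", "ro": "noapte"}
print(polyglot_replace("night", candidates, spanish_vocab))
```

In this toy setting the Portuguese form is selected because it shares the most characters, in order, with a word in the Spanish list. A realistic system would need a much larger vocabulary and probably a more linguistically informed similarity measure.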
Pages: 613-627
Number of pages: 15