Terminology Translation in Low-Resource Scenarios

Cited by: 2
Authors
Haque, Rejwanul [1]
Hasanuzzaman, Mohammed [2]
Way, Andy [1]
Affiliations
[1] Dublin City Univ, Sch Comp, Dublin 9, Glasnevin, Ireland
[2] Cork Inst Technol, Dept Comp Sci, Cork T12 P928, Ireland
Funding
Science Foundation Ireland
Keywords
machine translation; terminology translation; phrase-based statistical machine translation; neural machine translation; terminology translation evaluation
DOI
10.3390/info10090273
Chinese Library Classification
TP [Automation technology, computer technology]
Discipline classification code
0812
Abstract
Evaluating term translation quality in machine translation (MT), which is usually done by domain experts, is a time-consuming and expensive task. In fact, this is unimaginable in an industrial setting where customised MT systems often need to be updated for many reasons (e.g., availability of new training data, leading MT techniques). To the best of our knowledge, as of yet, there is no publicly available solution to evaluate terminology translation in MT automatically. Hence, there is a genuine need for a faster and less expensive solution to this problem, which could help end-users to identify term translation problems in MT instantly. This study presents a faster and less expensive strategy for evaluating terminology translation in MT. High correlations of our evaluation results with human judgements demonstrate the effectiveness of the proposed solution. The paper also introduces a classification framework, TermCat, that can automatically classify term translation-related errors and expose specific problems in relation to terminology translation in MT. We carried out our experiments with a low-resource language pair, English-Hindi, and found that our classifier, whose accuracy varies across the translation directions, error classes, the morphological nature of the languages, and MT models, generally performs competently in the terminology translation classification task.
Pages: 28
Related papers
50 in total
  • [31] Efficient Adaptation: Enhancing Multilingual Models for Low-Resource Language Translation
    Sel, Ilhami
    Hanbay, Davut
    MATHEMATICS, 2024, 12 (19)
  • [32] Enhancing distant low-resource neural machine translation with semantic pivot
    Zhu, Enchang
    Huang, Yuxin
    Xian, Yantuan
    Zhu, Junguo
    Gao, Minghu
    Yu, Zhiqiang
    ALEXANDRIA ENGINEERING JOURNAL, 2025, 116 : 633 - 643
  • [33] Empirical Analysis on Machine Translation Possibilities for Low-Resource Santali Language
    Sahoo, Sunil Kumar
    Biswal, Bhramara Bar
    Dash, Satya Ranjan
    Patra, Jyotiprakash
    INTERNATIONAL JOURNAL ON ARTIFICIAL INTELLIGENCE TOOLS, 2025,
  • [34] A Content Word Augmentation Method for Low-Resource Neural Machine Translation
    Li, Fuxue
    Zhao, Zhongchao
    Chi, Chuncheng
    Yan, Hong
    Zhang, Zhen
    ADVANCED INTELLIGENT COMPUTING TECHNOLOGY AND APPLICATIONS, ICIC 2023, PT IV, 2023, 14089 : 720 - 731
  • [35] Analysing terminology translation errors in statistical and neural machine translation
    Haque, Rejwanul
    Hasanuzzaman, Mohammed
    Way, Andy
    MACHINE TRANSLATION, 2020, 34 (2-3) : 149 - 195
  • [37] An empirical study of low-resource neural machine translation of manipuri in multilingual settings
    Singh, Salam Michael
    Singh, Thoudam Doren
    NEURAL COMPUTING & APPLICATIONS, 2022, 34 (17) : 14823 - 14844
  • [38] Fully Attentional Network for Low-Resource Academic Machine Translation and Post Editing
    Sel, Ilhami
    Hanbay, Davut
    APPLIED SCIENCES-BASEL, 2022, 12 (22):
  • [39] Pre-Training on Mixed Data for Low-Resource Neural Machine Translation
    Zhang, Wenbo
    Li, Xiao
    Yang, Yating
    Dong, Rui
    INFORMATION, 2021, 12 (03)
  • [40] How to choose the best pivot language for automatic translation of low-resource languages
    Paul, Michael
    Finch, Andrew
    Sumita, Eiichiro
    ACM Transactions on Asian Language Information Processing, 2013, 12 (04):