Terminology Translation in Low-Resource Scenarios

Cited by: 2
Authors
Haque, Rejwanul [1]
Hasanuzzaman, Mohammed [2]
Way, Andy [1]
Affiliations
[1] Dublin City Univ, Sch Comp, Dublin 9, Glasnevin, Ireland
[2] Cork Inst Technol, Dept Comp Sci, Cork T12 P928, Ireland
Funding
Science Foundation Ireland
Keywords
machine translation; terminology translation; phrase-based statistical machine translation; neural machine translation; terminology translation evaluation
DOI
10.3390/info10090273
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Subject classification code
0812
Abstract
Evaluating term translation quality in machine translation (MT), which is usually done by domain experts, is a time-consuming and expensive task. In fact, this is impractical in an industrial setting where customised MT systems often need to be updated for many reasons (e.g., availability of new training data or leading MT techniques). To the best of our knowledge, there is as yet no publicly available solution for evaluating terminology translation in MT automatically. Hence, there is a genuine need for a faster and less expensive solution to this problem, which could help end-users identify term translation problems in MT instantly. This study presents such a strategy for evaluating terminology translation in MT. High correlations of our evaluation results with human judgements demonstrate the effectiveness of the proposed solution. The paper also introduces a classification framework, TermCat, that can automatically classify term translation-related errors and expose specific problems in relation to terminology translation in MT. We carried out our experiments with a low-resource language pair, English-Hindi, and found that our classifier, whose accuracy varies across translation directions, error classes, the morphological nature of the languages, and MT models, generally performs competently in the terminology translation classification task.
Pages: 28