The Task of Post-Editing Machine Translation for the Low-Resource Language

Cited by: 4
Authors
Rakhimova, Diana [1, 2]
Karibayeva, Aidana [1, 2]
Turarbek, Assem [1]
Affiliations
[1] Al Farabi Kazakh Natl Univ, Dept Informat Syst, Alma Ata 050040, Kazakhstan
[2] Inst Informat & Comp Technol, Alma Ata 050010, Kazakhstan
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 02
Keywords
machine translation; post-editing machine translation; light post-editing; full post-editing; BRNN; transformer; English; Kazakh; Uzbek; Russian; HANDLING UNKNOWN WORDS; PRODUCT
DOI
10.3390/app14020486
Chinese Library Classification
O6 [Chemistry]
Discipline classification code
0703
Abstract
In recent years, machine translation has made significant advancements; however, its effectiveness can vary widely depending on the language pair. Languages with limited resources, such as Kazakh, Uzbek, Kalmyk, Tatar, and others, often encounter challenges in achieving high-quality machine translations. Kazakh is an agglutinative language with complex morphology, making it a low-resource language. This article addresses the task of post-editing machine translation for the Kazakh language. The research begins by discussing the history and evolution of machine translation and how it has developed to meet the unique needs of languages with limited resources. The research resulted in the development of a machine translation post-editing system. The system utilizes modern machine learning methods, starting with neural machine translation using the BRNN model in the initial post-editing stage. Subsequently, the transformer model is applied to further edit the text. Complex structural and grammatical forms are processed, and abbreviations are replaced. Practical experiments were conducted on various texts: news publications, legislative documents, IT sphere, etc. This article serves as a valuable resource for researchers and practitioners in the field of machine translation, shedding light on effective post-editing strategies to enhance translation quality, particularly in scenarios involving languages with limited resources such as Kazakh and Uzbek. The obtained results were tested and evaluated using specialized metrics: BLEU, TER, and WER.
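Of the three metrics named in the abstract, WER (word error rate) is the simplest to state: the word-level edit distance between a reference translation and a system output, divided by the number of reference words. The following is a minimal sketch of that general definition, not the authors' evaluation code; the function name `word_error_rate` and the example sentences are hypothetical, and whitespace tokenization is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: tokens are obtained by splitting on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference words processed so far
    # (initially none) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one word is missing from a six-word reference.
score = word_error_rate("the cat sat on the mat", "the cat sat on mat")
```

A lower score is better, and a post-edited output that matches the reference exactly scores 0.0. Because insertions are counted, the value can exceed 1.0 when the hypothesis is much longer than the reference.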
Pages: 19
Related papers
50 in total
  • [21] Translation Quality and Error Recognition in Professional Neural Machine Translation Post-Editing
    Vardaro, Jennifer
    Schaeffer, Moritz
    Hansen-Schirra, Silvia
    INFORMATICS-BASEL, 2019, 6 (03):
  • [23] Translation Memories as Baselines for Low-Resource Machine Translation
    Knowles, Rebecca
    Littell, Patrick
    LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2022, : 6759 - 6767
  • [24] Using the TED Talks to Evaluate Spoken Post-editing of Machine Translation
    Liyanapathirana, Jeevanthi
    Popescu-Belis, Andrei
    LREC 2016 - TENTH INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, 2016, : 2232 - 2239
  • [25] Machine Translation and Post-editing: Impact of Training and Directionality on Quality and Productivity
    Toledo Baez, M. Cristina
    TRADUMATICA-TRADUCCIO I TECNOLOGIES DE LA INFORMACIO I LA COMUNICACIO, 2018, (16): : 24 - 34
  • [26] Decisions in projects using machine translation and post-editing - an interview study
    Nitzke, Jean
    Canfora, Carmen
    Hansen-Schirra, Silvia
    Kapnas, Dimitrios
    JOURNAL OF SPECIALISED TRANSLATION, 2024, (41): : 127 - 148
  • [27] MACHINE TRANSLATION AND POST-EDITING IN WILDLIFE DOCUMENTARIES: CHALLENGES AND POSSIBLE SOLUTIONS
    Ortiz-Boix, Carla
    HERMENEUS, 2016, (18): : 269 - 313
  • [28] MACHINE TRANSLATION AND POST-EDITING: PROFILES AND COMPETENCES IN TRANSLATOR TRAINING PROGRAMMES
    Cid-Leal, Pilar
    Espin-Garcia, Maria-Carmen
    Presas, Marisa
    MONTI, 2019, 11 : 187 - 212
  • [29] Post-editing of machine translation while reading on English proficiency levels
    Kim, Hea-Suk
    Cha, Yoonjung
    LINGUISTIC RESEARCH, 2023, 40 : 89 - 126
  • [30] Perspectives and use of machine translation and post-editing in audiovisual translation: The point of view of professionals
    Martin, Jose Fernando Carrero
    Oliver, Beatriz Reverter
    TRADUMATICA-TRADUCCIO I TECNOLOGIES DE LA INFORMACIO I LA COMUNICACIO, 2024, (22): : 302 - 322