The Task of Post-Editing Machine Translation for the Low-Resource Language

Cited by: 4
Authors
Rakhimova, Diana [1 ,2 ]
Karibayeva, Aidana [1 ,2 ]
Turarbek, Assem [1 ]
Affiliations
[1] Al Farabi Kazakh Natl Univ, Dept Informat Syst, Alma Ata 050040, Kazakhstan
[2] Inst Informat & Comp Technol, Alma Ata 050010, Kazakhstan
Source
APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 02
Keywords
machine translation; post-editing machine translation; light post-editing; full post-editing; BRNN; transformer; English; Kazakh; Uzbek; Russian; HANDLING UNKNOWN WORDS; PRODUCT;
DOI
10.3390/app14020486
Chinese Library Classification number
O6 [Chemistry];
Discipline classification code
0703 ;
Abstract
In recent years, machine translation has made significant advancements; however, its effectiveness can vary widely depending on the language pair. Languages with limited resources, such as Kazakh, Uzbek, Kalmyk, Tatar, and others, often encounter challenges in achieving high-quality machine translations. Kazakh is an agglutinative language with complex morphology, making it a low-resource language. This article addresses the task of post-editing machine translation for the Kazakh language. The research begins by discussing the history and evolution of machine translation and how it has developed to meet the unique needs of languages with limited resources. The research resulted in the development of a machine translation post-editing system. The system utilizes modern machine learning methods, starting with neural machine translation using the BRNN model in the initial post-editing stage. Subsequently, the transformer model is applied to further edit the text. Complex structural and grammatical forms are processed, and abbreviations are replaced. Practical experiments were conducted on various texts: news publications, legislative documents, IT sphere, etc. This article serves as a valuable resource for researchers and practitioners in the field of machine translation, shedding light on effective post-editing strategies to enhance translation quality, particularly in scenarios involving languages with limited resources such as Kazakh and Uzbek. The obtained results were tested and evaluated using specialized metrics: BLEU, TER, and WER.
Pages: 19
Related papers
50 items in total
  • [31] Extremely Low-resource Multilingual Neural Machine Translation for Indic Mizo Language
    Lalrempuii C.
    Soni B.
    International Journal of Information Technology, 2023, 15 (8) : 4275 - 4282
  • [32] Transformers for Low-resource Neural Machine Translation
    Gezmu, Andargachew Mekonnen
    Nuernberger, Andreas
    ICAART: PROCEEDINGS OF THE 14TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE - VOL 1, 2022, : 459 - 466
  • [33] AAVE Corpus Generation and Low-Resource Dialect Machine Translation
    Graves, Eric
    Aswar, Shreyas
    Desai, Rujuta
    Nampelli, Srilekha
    Chakraborty, Sunandan
    Hall, Ted
    PROCEEDINGS OF THE ACM SIGCAS/SIGCHI CONFERENCE ON COMPUTING AND SUSTAINABLE SOCIETIES 2024, COMPASS 2024, 2024, : 50 - 59
  • [34] Boosting the Transformer with the BERT Supervision in Low-Resource Machine Translation
    Yan, Rong
    Li, Jiang
    Su, Xiangdong
    Wang, Xiaoming
    Gao, Guanglai
    APPLIED SCIENCES-BASEL, 2022, 12 (14):
  • [35] Translators' perceptions of literary post-editing using statistical and neural machine translation
    Moorkens, Joss
    Toral, Antonio
    Castilho, Sheila
    Way, Andy
    TRANSLATION SPACES, 2018, 7 (02) : 240 - 262
  • [36] USING POST-EDITING IN TRANSLATION AND LSP COURSES
    Udina, Natalia
    PROCEEDINGS OF INTCESS 2019- 6TH INTERNATIONAL CONFERENCE ON EDUCATION AND SOCIAL SCIENCES, 2019, : 1097 - 1101
  • [37] Cognitive effort in human translation and machine translation post-editing processes A holistic and phased view
    Wang, Yu
    Jalalian Daghigh, Ali
    FORUM-REVUE INTERNATIONALE D INTERPRETATION ET DE TRADUCTION-INTERNATIONAL JOURNAL OF INTERPRETATION AND TRANSLATION, 2023, 21 (01) : 139 - 162
  • [38] Can college students be post-editors? An investigation into employing language learners in machine translation plus post-editing settings
    Yamada, Masaru
    MACHINE TRANSLATION, 2015, 29 (01) : 49 - 67
  • [39] Identifying the Machine Translation Error Types with the Greatest Impact on Post-editing Effort
    Daems, Joke
    Vandepitte, Sonia
    Hartsuiker, Robert J.
    Macken, Lieve
    FRONTIERS IN PSYCHOLOGY, 2017, 8
  • [40] POST-EDITING PRACTICE IN SPECIALIZED TRANSLATION TRAINING
    Alvarez Garcia, Carmen
    CARACTERES-ESTUDIOS CULTURALES Y CRITICOS DE LA ESFERA DIGITAL, 2019, 8 (02) : 67 - 91