The Task of Post-Editing Machine Translation for the Low-Resource Language

Cited by: 4

Authors:

Rakhimova, Diana [1,2]

Karibayeva, Aidana [1,2]

Turarbek, Assem [1]

Affiliations:

[1] Al Farabi Kazakh Natl Univ, Dept Informat Syst, Alma Ata 050040, Kazakhstan

[2] Inst Informat & Comp Technol, Alma Ata 050010, Kazakhstan

Source:

APPLIED SCIENCES-BASEL | 2024 / Vol. 14 / Issue 02

Keywords:

machine translation; post-editing machine translation; light post-editing; full post-editing; BRNN; transformer; English; Kazakh; Uzbek; Russian; HANDLING UNKNOWN WORDS; PRODUCT;

DOI:

10.3390/app14020486

Chinese Library Classification (CLC) number:

O6 [Chemistry];

Discipline classification code:

0703;

Abstract:

In recent years, machine translation has advanced significantly; however, its effectiveness varies widely with the language pair. High-quality machine translation remains difficult for languages with limited resources, such as Kazakh, Uzbek, Kalmyk, and Tatar. Kazakh is an agglutinative language with complex morphology, which compounds the difficulties of its low-resource status. This article addresses the task of post-editing machine translation for the Kazakh language. It begins by discussing the history and evolution of machine translation and how the field has developed to meet the needs of languages with limited resources. The research resulted in a machine translation post-editing system built on modern machine learning methods: the initial post-editing stage uses neural machine translation with a BRNN model, and a transformer model is then applied to further edit the text. Complex structural and grammatical forms are processed, and abbreviations are replaced. Practical experiments were conducted on texts from several domains, including news publications, legislative documents, and IT. The results were evaluated using the BLEU, TER, and WER metrics. The article is intended as a resource for researchers and practitioners in machine translation, shedding light on post-editing strategies that improve translation quality, particularly for languages with limited resources such as Kazakh and Uzbek.
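
The abstract names WER (word error rate) as one of the evaluation metrics. As a minimal sketch of how that metric is commonly defined (word-level edit distance divided by the number of reference words), and not the authors' evaluation code, the function below computes it for a single sentence pair. The function name `word_error_rate` and the example sentences are illustrative assumptions, not taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level Levenshtein distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences (hypothetical, not from the paper's data).
reference = "the law was adopted in 2023"
raw_mt = "law was adopt in 2023"
post_edited = "the law was adopted in 2023"

print(word_error_rate(reference, raw_mt))       # 2 edits over 6 reference words
print(word_error_rate(reference, post_edited))  # identical to the reference
```

Under this definition, a lower WER after post-editing than before indicates output closer to the reference. BLEU and TER are computed differently and are not covered by this sketch.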

Pages: 19

50 records in total

[1] On the correctness of machine translation: A machine translation post-editing task
Koponen, Maarit
Salmi, Leena
JOURNAL OF SPECIALISED TRANSLATION, 2015, (23): 117-135
[2] Training in machine translation post-editing for foreign language students
Zhang, Hong
Torres-Hostench, Olga
LANGUAGE LEARNING & TECHNOLOGY, 2022, 26 (01)
[3] Fully Attentional Network for Low-Resource Academic Machine Translation and Post Editing
Sel, Ilhami
Hanbay, Davut
APPLIED SCIENCES-BASEL, 2022, 12 (22)
[4] Second language learners' post-editing strategies for machine translation errors
Shin, Dongkwang
Chon, Yuah V.
LANGUAGE LEARNING & TECHNOLOGY, 2023, 27 (01)
[5] The neural machine translation models for the low-resource Kazakh-English language pair
Karyukin, Vladislav
Rakhimova, Diana
Karibayeva, Aidana
Turganbayeva, Aliya
Turarbek, Asem
PEERJ COMPUTER SCIENCE, 2023, 9
[6] Evaluating the use of machine translation post-editing in the foreign language class
Nino, Ana
COMPUTER ASSISTED LANGUAGE LEARNING, 2008, 21 (01): 29-49
[7] Mind the gap: The nature of machine translation post-editing
Rico, Celia
BABEL-REVUE INTERNATIONALE DE LA TRADUCTION-INTERNATIONAL JOURNAL OF TRANSLATION, 2022, 68 (05): 697-722
[8] Maximum Entropy Model of Synonym Selection in Post-editing Machine Translation into Kazakh Language
Shormakova, Assem
Tukeyev, Ualsher
ADVANCES IN COMPUTATIONAL COLLECTIVE INTELLIGENCE, ICCCI 2024, PT II, 2024, 2166: 111-123
[9] Is machine translation post-editing worth the effort? A survey of research into post-editing and effort
Koponen, Maarit
JOURNAL OF SPECIALISED TRANSLATION, 2016, (25): 131-148
[10] Post-Editing Machine Translation As an FSL Exercise
Kliffer, Michael D.
PORTA LINGUARUM, 2008, (09): 53-67
