A Survey of Orthographic Information in Machine Translation

Cited by: 4
Authors
Chakravarthi B.R. [1]
Rani P. [1]
Arcan M. [2]
McCrae J.P. [1]
Affiliations
[1] Unit for Linguistic Data, Insight SFI Research Centre for Data Analytics, Data Science Institute, National University of Ireland Galway, Galway
[2] Unit for Natural Language Processing, Insight SFI Research Centre for Data Analytics, Data Science Institute, National University of Ireland Galway, Galway
Funding
EU Horizon 2020; Science Foundation Ireland
Keywords
Machine translation; Neural machine translation; Orthography; Rule-based machine translation; Statistical machine translation; Under-resourced languages;
DOI
10.1007/s42979-021-00723-4
Abstract
Machine translation is an application of natural language processing that has been explored for many languages. Recently, researchers have started paying attention to machine translation for resource-poor and closely related languages. A widespread underlying problem for these systems is linguistic difference and variation in orthographic conventions, which causes many issues for traditional approaches. Two languages written in different orthographies are not easily comparable, but orthographic information can also be used to improve a machine translation system. This article surveys research on the influence of orthography on machine translation of under-resourced languages. It introduces under-resourced languages in the context of machine translation and explains how orthographic information can be used to improve translation. We describe previous work in this area, discuss the underlying assumptions that were made, and show how orthographic knowledge improves the performance of machine translation for under-resourced languages. We discuss different types of machine translation and describe a recent trend that seeks to link orthographic information with well-established machine translation methods. Considerable attention is given to current efforts that use cognate information at different levels of machine translation and to the lessons that can be drawn from them. Multilingual neural machine translation of closely related languages is also given particular focus. The article ends with a discussion of the way forward for machine translation with orthographic information, focusing on multilingual settings and bilingual lexicon induction. © 2021, The Author(s).