Dealing with unknowns in machine translation

Times cited: 0
Author
Sinha, RMK [1]
Affiliation
[1] Indian Inst Technol, Kanpur 208016, Uttar Pradesh, India
Source
2001 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS, VOLS 1-5: E-SYSTEMS AND E-MAN FOR CYBERNETICS IN CYBERSPACE | 2002
Keywords
machine translation; natural language processing; unknown words; English to Hindi
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
An 'unknown' is defined as a word for which there is no entry in the dictionary used by the translation system. In general, a text may contain several unknowns. These words may be names, acronyms, abbreviations, terminology or foreign words. It is common practice in India to mix English words into Hindi and other Indian languages, and vice versa. However, the grammatical rules for constructing gender, number, and verb-normalization or other forms conform to those of the language being used, irrespective of the words' origin. This gives rise to frequent encounters with unknown words in day-to-day communication. A machine translation system has to provide a mechanism for handling such unknowns. Spelling mistakes are yet another source of unknowns. In this paper we describe the strategy adopted in our system for machine-aided translation from English to Hindi. No attempt has been made to expand the vocabulary by deriving the meaning of unknown words. Instead, once an unknown is identified, a transliteration in Hindi with appropriate suffixes or appendages is substituted for its meaning. We use predictive parsing and a number of heuristics to identify the type of unknown.
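As a rough illustration of the first step the abstract describes (spotting words with no dictionary entry and guessing their type), here is a minimal Python sketch. It is not the paper's method: the lexicon, the function names and the surface heuristics are all hypothetical, the predictive parsing mentioned in the abstract is not modeled, and the Hindi transliteration step is left out entirely.

```python
# Toy lexicon standing in for the translation system's dictionary (hypothetical).
LEXICON = {"the", "report", "was", "sent", "to", "by", "from", "office"}


def tokenize(sentence):
    """Split on whitespace and trim surrounding punctuation."""
    tokens = []
    for raw in sentence.split():
        tok = raw.strip(",;:!?\"'()")
        # Drop a lone trailing period, but keep periods inside forms like "U.N."
        if tok.endswith(".") and tok.count(".") == 1:
            tok = tok[:-1]
        if tok:
            tokens.append(tok)
    return tokens


def classify_unknown(token, is_first):
    """Guess the type of an unknown from its surface form (assumed heuristics)."""
    if "." in token:
        return "abbreviation"
    if len(token) > 1 and token.isupper():
        return "acronym"
    if token[0].isupper() and not is_first:
        return "name"
    # Could be terminology, a foreign word, or a spelling mistake.
    return "other"


def find_unknowns(sentence):
    """Return (token, guessed_type) for every token missing from the lexicon."""
    unknowns = []
    for index, tok in enumerate(tokenize(sentence)):
        if tok.lower() in LEXICON:
            continue
        unknowns.append((tok, classify_unknown(tok, index == 0)))
    return unknowns


if __name__ == "__main__":
    print(find_unknowns("The report was sent to Sharma by the UNESCO office."))
```

In a fuller system, each identified unknown would then be transliterated and given the suffixes the target grammar requires. That step depends on a transliteration scheme the abstract does not specify, so it is not sketched here.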
Pages: 940-944
Number of pages: 5