Classification-Based Approach for Hybridizing Statistical and Rule-Based Machine Translation

Cited by: 4
Authors
Park, Eun-Jin [1]
Kwon, Oh-Woog [1]
Kim, Kangil [1]
Kim, Young-Kil [1]
Affiliation
[1] ETRI, SW & Contents Res Lab, Taejon, South Korea
Keywords
Machine translation; hybrid machine translation; automatic labeling; rule-based machine translation; statistical machine translation
DOI
10.4218/etrij.15.0114.1017
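Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology]
Discipline classification code
0808; 0809
Abstract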
In this paper, we propose a classification-based approach for hybridizing statistical machine translation and rule-based machine translation. The hybridization quality depends on both the training dataset used to train the proposed classifier and the feature extraction method. To create such a training dataset, a previous approach used automatic evaluation metrics to determine, by comparison, which of a set of component machine translation (MT) systems gave the more accurate translation. The most accurate translation was then labeled to indicate the MT system from which it came. In this previous approach, when the metric evaluation scores were low, there was a high level of uncertainty as to which of the component MT systems actually produced the better translation. To reduce such uncertainty or error in classification, we propose an alternative approach to labeling: a cut-off method. In our experiments, using this cut-off method with our proposed classifier, we achieved a translation accuracy of 81.5%, a 5.0% improvement over existing methods.
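The abstract does not define the cut-off rule, so the following is only a minimal sketch of how comparative labeling and a cut-off might differ. It assumes each sentence already has one automatic metric score per component system, and that the cut-off discards sentences whose best score falls below a threshold. The function names, the tie-breaking rule, and the threshold value are all hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple


def label_comparative(smt_score: float, rbmt_score: float) -> str:
    """Comparative labeling: pick the system with the higher metric score.

    Ties go to SMT; this tie-break is an assumption for illustration.
    """
    return "SMT" if smt_score >= rbmt_score else "RBMT"


def label_with_cutoff(
    smt_score: float, rbmt_score: float, cutoff: float
) -> Optional[str]:
    """Hypothetical cut-off labeling.

    If both scores are below `cutoff`, the comparison is treated as too
    uncertain and no label is produced (None). Otherwise the comparative
    label is used.
    """
    if max(smt_score, rbmt_score) < cutoff:
        return None
    return label_comparative(smt_score, rbmt_score)


def build_training_labels(
    scores: List[Tuple[float, float]], cutoff: float
) -> List[Tuple[int, str]]:
    """Return (sentence index, label) pairs for sentences that pass the cut-off."""
    kept = []
    for idx, (smt, rbmt) in enumerate(scores):
        label = label_with_cutoff(smt, rbmt, cutoff)
        if label is not None:
            kept.append((idx, label))
    return kept


# Illustrative (made-up) per-sentence metric scores: (SMT score, RBMT score).
example_scores = [(0.62, 0.48), (0.10, 0.12), (0.35, 0.71)]
example_labels = build_training_labels(example_scores, cutoff=0.30)
```

In this sketch the second sentence is dropped because both scores are low, which mirrors the uncertainty the abstract describes; the labels that remain would then be used to train the classifier.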
Pages: 541-550
Number of pages: 10