Comparison and system combination of n-gram-based and syntax-based machine translation systems

Cited by: 0
Authors
Khalilov, Maxim [1]
Fonollosa, Jose A. R. [1]
Affiliations
[1] Univ Politecn Cataluna, Ctr Recerca TALP, Campus Nord, C Jordi Girona 1-3, Barcelona, Spain
Source
PROCESAMIENTO DEL LENGUAJE NATURAL | 2008 / Issue 41
Keywords
Statistical machine translation; syntax-based translation; n-grams; system combination;
DOI
Not available
Chinese Library Classification
H0 [Linguistics];
Discipline classification codes
030303 ; 0501 ; 050102 ;
Abstract
In this paper we compare two approaches to machine translation: the Syntax Augmented Machine Translation (SAMT) system, a syntax-driven translation system built on a phrase-based model, and n-gram-based Statistical Machine Translation (SMT), in which the translation process is based on statistical modeling of the bilingual context. We provide a step-by-step comparison of the systems, reporting results in terms of automatic evaluation metrics and required computational resources for a small Arabic-to-English translation task from the news domain. Finally, we combine the output of both systems, which yields a significant improvement in translation quality.
Pages: 259-266
Page count: 8