Methods for integrating rule-based and statistical systems for Arabic to English machine translation

Cited by: 4
Authors
Zbib, Rabih [1]
Kayser, Michael [2]
Matsoukas, Spyros [2]
Makhoul, John [2]
Nader, Hazem [3]
Soliman, Hamdy [3]
Safadi, Rami [3]
Affiliations
[1] MIT, 77 Massachusetts Ave, Cambridge, MA 02139 USA
[2] BBN Technol, Cambridge, MA 02138 USA
[3] Sakhr Software, Cairo 11771, Egypt
Keywords
Statistical machine translation; Rule-based machine translation; System integration; Arabic
DOI
10.1007/s10590-011-9106-9
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
This article presents several techniques for integrating information from a rule-based machine translation (RBMT) system into a statistical machine translation (SMT) framework. The techniques are grouped into three parts according to the level at which information is integrated: morphological, lexical, and system. The first part presents techniques that use information from a rule-based morphological tagger to split the Arabic source text into morphemes. We also compare these results with those obtained using a statistical morphological tagger. In the second part, we present two ways of using Arabic diacritics to improve SMT results, both based on binary decision trees. The third part presents a system combination method that combines the outputs of the RBMT and SMT systems, leveraging the strengths of each. This article shows how language-specific information obtained through a deterministic rule-based process can be used to improve SMT, which is mostly language-independent.
Pages: 67-83
Page count: 17