Decorated Phrase Model and Syntax-Based Reordering Model for Statistical Machine Translation

Cited by: 0
Authors
Liang, Huashen [1]
Xue, Yongzeng [2]
Zhao, Tiejun [1]
Affiliations
[1] Harbin Inst Technol, MOE MS Key Lab Nat Language Proc & Speech, Harbin 150001, Peoples R China
[2] Harbin Inst Technol, Dept New Media Technol & Art, Harbin 150001, Peoples R China
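Funding
National High Technology Research and Development Program of China (863 Program); National Natural Science Foundation of China
Keywords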
phrase-based statistical machine translation; reordering model; syntactic structure; syntax encapsulated phrase model;
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In the past few years, much attention has been paid to extending phrase-based statistical machine translation with syntactic structures. In this paper, we introduce a novel phrase model in which treebank tags are employed to decorate the bilingual phrase pairs. We use tag sequences, instead of phrase pairs, to train the lexicalized reordering model. Since the number of treebank tags is much smaller than the number of words, the tag-sequence-based reordering model is smaller and more accurate than the phrase-based reordering model. Experiments were carried out on three types of models: the phrase model, the POS tag encapsulated phrase (PTEP) model, and the syntactic tag encapsulated phrase (STEP) model. The STEP model obtained a higher BLEU-4 score than the other models on NIST MT tasks.
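The idea of training a lexicalized reordering model on tag sequences rather than on phrase pairs can be illustrated with a minimal sketch. It is not the authors' implementation: the function name train_tag_reordering, the three orientation labels (the common monotone/swap/discontinuous scheme), the add-alpha smoothing, and the toy events are all assumptions made for illustration. The abstract does not specify any of them.

```python
from collections import Counter, defaultdict

# Assumed orientation classes; the abstract does not name the scheme used.
ORIENTATIONS = ("monotone", "swap", "discontinuous")


def train_tag_reordering(events, alpha=0.5):
    """Estimate P(orientation | source tags, target tags) with add-alpha smoothing.

    `events` is an iterable of (source_tag_sequence, target_tag_sequence,
    orientation) triples. Keying on tag sequences instead of word sequences
    means many distinct phrase pairs share one entry, so the table is smaller
    and each entry is estimated from more observations.
    """
    counts = defaultdict(Counter)
    for src_tags, tgt_tags, orientation in events:
        counts[(tuple(src_tags), tuple(tgt_tags))][orientation] += 1

    model = {}
    for key, c in counts.items():
        total = sum(c.values()) + alpha * len(ORIENTATIONS)
        model[key] = {o: (c[o] + alpha) / total for o in ORIENTATIONS}
    return model


# Hypothetical toy events; real events would be extracted from
# word-aligned, tagged parallel text.
events = [
    (["NN", "VV"], ["NN", "VBD"], "monotone"),
    (["NN", "VV"], ["NN", "VBD"], "monotone"),
    (["NN", "VV"], ["NN", "VBD"], "swap"),
    (["DEG", "NN"], ["IN", "NN"], "swap"),
]
model = train_tag_reordering(events)
```

In this sketch, three phrase pairs with different words but the same tag pattern contribute to a single distribution, which is the sparsity argument the abstract makes for the tag-based model.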
Pages: 314-319
Number of pages: 6