Inter-, Intra-, and Extra-Chunk Pre-Ordering for Statistical Japanese-to-English Machine Translation

Cited by: 3
Authors
Ding, Chenchen [1, 2]
Sakanushi, Keisuke [1 ]
Touji, Hirona [1 ]
Yamamoto, Mikio [3 ]
Affiliations
[1] Univ Tsukuba, Tsukuba, Ibaraki 3058573, Japan
[2] Natl Inst Informat & Commun Technol, Multilingual Translat Lab, 3-5 Hikaridai, Kyoto 6190289, Japan
[3] Univ Tsukuba, Dept Comp Sci, 1-1-1 Tennodai, Tsukuba, Ibaraki 3058573, Japan
Keywords
Japanese; English; rule-based; pre-ordering; morpheme; chunk; dependency structure
DOI
10.1145/2818381
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
A rule-based pre-ordering approach is proposed for statistical Japanese-to-English machine translation using the dependency structure of source-side sentences. A Japanese sentence is pre-ordered to an English-like order at the morpheme level for a statistical machine translation system during both the training and decoding phases to resolve the reordering problem. In this article, extra-chunk pre-ordering of morphemes is proposed, which allows Japanese functional morphemes to move across chunk boundaries. This contrasts with the intra-chunk reordering used in previous approaches, which restricts the reordering of morphemes to within a chunk. Linguistically oriented discussions show that correct pre-ordering cannot be realized without extra-chunk movement of morphemes. The proposed approach is compared with five rule-based pre-ordering approaches designed for Japanese-to-English translation and with a language-independent statistical pre-ordering approach on a standard patent dataset and on a news dataset obtained by crawling Internet news sites. Two state-of-the-art statistical machine translation systems, one phrase-based and the other hierarchical phrase-based, are used in the experiments. Experimental results show that the proposed approach outperforms the compared approaches on automatic reordering measures (Kendall's tau, Spearman's rho, fuzzy reordering score, and test set RIBES) and on the automatic translation precision measure of test set BLEU score.
Pages: 28