wEBMT: Developing and Validating an Example-Based Machine Translation System Using the World Wide Web

Cited by: 14
Authors
Way, A [1]
Gough, N [1]
Affiliation
[1] Dublin City Univ, Sch Comp, Dublin 9, Ireland
DOI
10.1162/089120103322711596
Chinese Library Classification
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
We have developed an example-based machine translation (EBMT) system that uses the World Wide Web for two different purposes: First, we populate the system's memory with translations gathered from rule-based MT systems located on the Web. The source strings input to these systems were extracted automatically from an extremely small subset of the rule types in the Penn-II Treebank. In subsequent stages, the (source, target) translation pairs obtained are automatically transformed into a series of resources that render the translation process more successful. Despite the fact that the output from on-line MT systems is often faulty, we demonstrate in a number of experiments that when used to seed the memories of an EBMT system, they can in fact prove useful in generating translations of high quality in a robust fashion. In addition, we demonstrate the relative gain of EBMT in comparison to on-line systems. Second, despite the perception that the documents available on the Web are of questionable quality, we demonstrate in contrast that such resources are extremely useful in automatically postediting translation candidates proposed by our system.
Pages: 421-457
Page count: 37