Farsi-German statistical machine translation through bridge language

Cited by: 2
Authors
Bakhshaei S. [1]
Khadivi S. [2]
Riahi N. [1]
Affiliations
[1] Department of Computer and Technology, Alzahra University, Tehran
[2] Department of Computer and Engineering, Amirkabir University of Technology, Tehran
Source
2010 5th International Symposium on Telecommunications, IST 2010 | 2010
Keywords
Bridge language; Communication tools; English; Farsi-German; Pivot language; SMT; Statistical machine translation
DOI
10.1109/ISTEL.2010.5734087
Abstract
Since statistical machine translation outperforms other approaches in the field of machine translation, we use it to develop Farsi-German and German-Farsi translation systems. We start from an existing English-German bilingual corpus and manually translate a large part of its English side into Farsi to obtain the required training data. We then use English as the bridge language to build the Farsi-German translation systems, in addition to building systems directly on the corresponding Farsi-German bilingual corpus. Because different amounts of resources exist for the Farsi-English and German-English language pairs, we investigate different approaches to combining these two machine translation systems. We show that the BLEU score of the Farsi-German system improves by about 15% relative to the baseline system. © 2010 IEEE.
Pages: 557-561
Number of pages: 4