Comparison of Statistical Approaches for Tamil to English Translation

Cited by: 1
Authors
Rajkiran, R. [1 ]
Prashanth, S. [1 ]
Keshav, K. Amarnath [1 ]
Rajeswari, Sridhar [1 ]
Affiliation
[1] Anna Univ, Dept Comp Sci & Engn, Coll Engn, Madras 600025, Tamil Nadu, India
Source
COMPUTATIONAL INTELLIGENCE IN DATA MINING, VOL 2 | 2015 / Vol. 32
Keywords
Statistical machine translation; Syntax based; Hierarchical phrase based; BLEU score;
DOI
10.1007/978-81-322-2208-8_30
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
This work proposes a machine translation system from Tamil to English using a statistical approach. Statistical machine translation (SMT) is a machine translation paradigm in which translations are generated from statistical models whose parameters are derived from the analysis of bilingual text corpora. It is the most widely used machine translation paradigm because of the tradeoff it offers between efficiency and implementation feasibility and because of its partial language independence. In the syntax based approach, a phrase table is created that identifies the most probable English translation of each Tamil phrase in the input sentence. In the hierarchical phrase based approach, a rule table is used to reduce the input Tamil sentence to the output English sentence. We evaluated the two approaches under different parameters, such as corpus size and the n-gram size of the language model, and achieved a BLEU score of 0.26.
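For readers unfamiliar with the metric reported in the abstract, the following is a minimal sketch of how a BLEU score can be computed. It is not the authors' evaluation setup, which the abstract does not describe. It assumes a single reference, whitespace tokenization, sentence-level scoring, and no smoothing; the function names `ngrams` and `bleu` are illustrative only.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Single-reference, sentence-level BLEU without smoothing (illustrative)."""
    cand = candidate.split()
    ref = reference.split()
    if not cand:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(cand, n)
        ref_counts = ngrams(ref, n)
        total = sum(cand_counts.values())
        # Clip each candidate n-gram count by its count in the reference.
        matched = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        if total == 0 or matched == 0:
            return 0.0  # unsmoothed BLEU is zero if any n-gram precision is zero
        log_precision_sum += math.log(matched / total)
    # Brevity penalty: applies when the candidate is shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return bp * math.exp(log_precision_sum / max_n)
```

Scores lie between 0 and 1, so the reported 0.26 is on this scale. Published evaluations normally compute BLEU over a whole test corpus rather than per sentence, which this sketch does not do.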
Pages: 321-331
Page count: 11