A NOVEL APPROACH TO MACHINE TRANSLATION: A PROPOSED LANGUAGE-INDEPENDENT SYSTEM BASED ON DEDUCTIVE SCHEMES

Cited by: 0
Authors
Fakhrahmad, S. M. [1 ]
Rezapour, A. R. [1 ]
Sadreddini, M. H. [1 ]
Jahromi, M. Zolghadri [1 ]
Affiliations
[1] Shiraz Univ, Sch Elect & Elect Engn, Dept Comp Sci & Engn, Shiraz, Iran
Keywords
Machine translation; example-based; rule-based; corpora-based; finite automata; grammar induction
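DOI
Not available
Chinese Library Classification number
TM [Electrical engineering]; TN [Electronic technology, communication technology];
Discipline classification code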
0808 ; 0809 ;
Abstract
Compared with corpora-based machine translation methods, rule-based methods have deficiencies that make them unattractive to researchers in this field. The first problem is that these methods are language dependent: they require syntactic information about both the source and target languages. Moreover, in many cases, especially for proverbs and specific expressions, syntactic rules are no longer useful, and example-based approaches become unavoidable. In this work, we propose and integrate a set of novel schemes into a new translation system called BORNA. First, a grammar induction method based on the Expectation Maximization (EM) algorithm is proposed. After the extracted knowledge is represented as a set of nested finite automata, a recursive model is proposed that combines rule-based and example-based techniques. In the translation phase, a hierarchical chunking process divides the input sentence into a set of phrases. Each phrase is searched for in the corpus of examples. If the phrase is found, it is not chunked further; otherwise, it is divided into smaller sub-phrases. The simulation results show that BORNA significantly outperforms its counterparts. Compared with the PARS, Frengly, and Google translators, BORNA achieves the highest BLEU scores for its translations and the lowest values for several error measures, including PER, TER, and WER.
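The translation-phase lookup described in the abstract (search each phrase in the example corpus, stop if found, otherwise split and recurse) can be illustrated with a minimal Python sketch. This is not the BORNA implementation: the abstract does not say how a phrase is divided, so the midpoint split in `split_phrase` is a placeholder assumption standing in for the grammar-driven chunking, and the names `EXAMPLES`, `split_phrase`, and `chunk`, as well as the toy corpus entries, are hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple

Phrase = Tuple[str, ...]

# Hypothetical toy example corpus: source phrase -> stored target translation.
EXAMPLES: Dict[Phrase, str] = {
    ("the", "old", "men"): "los ancianos",
    ("kick", "the", "bucket"): "estiran la pata",
}


def split_phrase(tokens: Sequence[str]) -> List[Sequence[str]]:
    """Placeholder splitter (assumption): cut the phrase at its midpoint.

    The system described in the abstract would choose split points from
    its induced grammar; that step is not specified there.
    """
    mid = len(tokens) // 2
    return [tokens[:mid], tokens[mid:]]


def chunk(
    tokens: Sequence[str], examples: Dict[Phrase, str]
) -> List[Tuple[Phrase, Optional[str]]]:
    """Recursively chunk a sentence, stopping wherever an example matches.

    Returns (phrase, translation) pairs in source order; the translation
    is None for single words that have no entry in the example corpus.
    """
    key = tuple(tokens)
    if key in examples:
        # Phrase found among the examples: do not chunk it any further.
        return [(key, examples[key])]
    if len(key) <= 1:
        # Nothing left to split; leave the word untranslated in this sketch.
        return [(key, None)]
    result: List[Tuple[Phrase, Optional[str]]] = []
    for part in split_phrase(tokens):
        result.extend(chunk(part, examples))
    return result


if __name__ == "__main__":
    for phrase, translation in chunk("the old men kick the bucket".split(), EXAMPLES):
        print(" ".join(phrase), "->", translation)
```

With a naive midpoint split, an idiom survives only if a split boundary happens to fall beside it, which is one reason a grammar-informed chunker would be preferable to this placeholder.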
Pages: 59-72
Number of pages: 14