Minimum description length inference of phrase-based translation models

Cited by: 0
Authors
Jesús González-Rubio
Francisco Casacuberta
Affiliations
[1] Universitat Politècnica de València, Pattern Recognition and Human Language Technology Center
[2] WebInterpret.com
Source
Neural Computing and Applications | 2017 / Vol. 28
Keywords
Minimum description length; Statistical machine translation; Phrase-based translation models
DOI
Not available
Abstract
This work explores the application of minimum description length (MDL) inference to estimate the parameters of phrase-based statistical machine translation (SMT) models. Current inference techniques rely on a long, decoupled pipeline with multiple heuristic steps. MDL, by contrast, is a theoretically well-founded approach, although its empirical results are below those of the heuristically motivated state-of-the-art training pipeline. We identify potential limitations of MDL inference when applied to natural language and propose practical approaches to overcome them when inferring SMT models. An evaluation on a Spanish-to-English translation task demonstrates that MDL inference can be adapted to yield performance close to the state of the art.
Pages: 2403-2413
Page count: 10