A systematic comparison of various statistical alignment models

Cited by: 73

Authors:

Och, FJ

Ney, H

Affiliations:

[1] Univ So Calif, Inst Informat Sci, Marina Del Rey, CA 90292 USA

[2] RWTH Aachen Univ Technol, Dept Comp Sci, Lehrstuhl Informat 6, D-52056 Aachen, Germany

Source:

COMPUTATIONAL LINGUISTICS | 2003 / Vol. 29 / No. 01

Keywords:

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

We present and compare various methods for computing word alignments using statistical or heuristic models. We consider the five alignment models presented in Brown, Della Pietra, Della Pietra, and Mercer (1993), the hidden Markov alignment model, smoothing techniques, and refinements. These statistical models are compared with two heuristic models based on the Dice coefficient. We present different methods for combining word alignments to perform a symmetrization of directed statistical alignment models. As evaluation criterion, we use the quality of the resulting Viterbi alignment compared to a manually produced reference alignment. We evaluate the models on the German-English Verbmobil task and the French-English Hansards task. We perform a detailed analysis of various design decisions of our statistical alignment system and evaluate these on training corpora of various sizes. An important result is that refined alignment models with a first-order dependence and a fertility model yield significantly better results than simple heuristic models. In the Appendix, we present an efficient training algorithm for the alignment models presented.
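
As a rough illustration of two ideas named in the abstract, the sketch below computes Dice-coefficient association scores from sentence-level co-occurrence counts, derives a simple heuristic alignment from them, and combines two directed alignments by intersection and union. It is not the authors' implementation: the function names, the toy bitext, and the per-target-word argmax linking rule are assumptions made for this example, and the record does not specify which combination methods the paper uses.

```python
from collections import Counter
from itertools import product


def dice_scores(bitext):
    """Dice coefficient per word pair: 2 * C(s, t) / (C(s) + C(t)).

    bitext is a list of (source_tokens, target_tokens) sentence pairs.
    Counts are per sentence pair (types, not tokens); this is an assumption.
    """
    src_count, tgt_count, pair_count = Counter(), Counter(), Counter()
    for src, tgt in bitext:
        s_types, t_types = set(src), set(tgt)
        src_count.update(s_types)
        tgt_count.update(t_types)
        pair_count.update(product(s_types, t_types))
    return {
        (s, t): 2.0 * c / (src_count[s] + tgt_count[t])
        for (s, t), c in pair_count.items()
    }


def align_by_dice(src, tgt, scores):
    """Link each target position j to the source position i with the best score.

    A plain argmax rule chosen for illustration only.
    """
    links = set()
    for j, t in enumerate(tgt):
        best_i = max(range(len(src)), key=lambda i: scores.get((src[i], t), 0.0))
        links.add((best_i, j))
    return links


def symmetrize(links_a, links_b):
    """Combine two directed alignments given as sets of (i, j) links.

    Returns (intersection, union): the first favors precision,
    the second favors recall.
    """
    return links_a & links_b, links_a | links_b


# Hypothetical toy corpus, not data from the paper.
bitext = [
    (["das", "haus"], ["the", "house"]),
    (["das", "buch"], ["the", "book"]),
]
scores = dice_scores(bitext)
links = align_by_dice(["das", "haus"], ["the", "house"], scores)
```

On the toy corpus, "das" co-occurs with "the" in both sentence pairs, so that pair scores higher than "das" paired with "house", and the argmax rule links each word to its translation.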

Citation

Pages: c / 51

Page count: 33

References: 36 in total

[1] ALONAIZAN Y, 1999, JHU WORKSH
[2] ALSHAWI H, 1998, COLING ACL 98 36 ANN, V1, P41
[3] [Anonymous], P WORKSH HUM LANG TE
[4] [Anonymous], P HUM LANG TECHN C
[5] [Anonymous], 1993, EUR C SPEECH COMM TE
[6] Baum L.E., 1972, Inequalities III: Proceedings of the Third Symposium on Inequalities, page, V3, P1
[7] BERGER AL, 1994, P ARPA WORKSH HUM LA, P157
[8] Brown P. F., 1993, Computational Linguistics, V19, P263
[9] Brown RD, 1997, P 7 INT C THEOR METH, P111
[10] Dagan I., 1993, P WORKSH VER LARG CO, P1
