Towards Better Word Alignment in Transformer

Cited by: 5
Authors:
Song, Kai [1 ,2 ]
Zhou, Xiaoqing [1 ]
Yu, Heng [3 ]
Huang, Zhongqiang [3 ]
Zhang, Yue [4 ]
Luo, Weihua [3 ]
Duan, Xiangyu [1 ]
Zhang, Min [1 ]
Affiliations:
[1] Soochow Univ, Suzhou 215000, Peoples R China
[2] DAMO Acad, Hangzhou 310051, Peoples R China
[3] Alibaba DAMO Acad, Hangzhou 310051, Peoples R China
[4] Westlake Univ, Hangzhou 310000, Peoples R China
Funding:
National Natural Science Foundation of China
Keywords:
Decoding; Data models; Training; Context modeling; Standards; Speech processing; Error analysis; Neural network; neural machine translation; Transformer; word alignment; language model pre-training; alignment concentration;
DOI:
10.1109/TASLP.2020.2998278
Chinese Library Classification:
O42 [Acoustics]
Discipline Codes:
070206; 082403
Abstract:
While neural models based on the Transformer architecture achieve state-of-the-art translation performance, it is well known that the learned target-to-source attention weights do not correlate well with word alignment. There is growing interest in inducing accurate word alignments in the Transformer, given their important role in practical applications such as dictionary-guided translation and interactive translation. In this article, we extend and improve recent work on unsupervised learning of word alignment in the Transformer along two dimensions: a) parameter initialization from a pre-trained cross-lingual language model, which leverages large amounts of monolingual data to learn robust contextualized word representations, and b) regularization of the training objective to directly model characteristics of word alignments, so that favorable word alignments receive more concentrated probabilities. Experiments on benchmark data sets of three language pairs show that the proposed methods significantly reduce the alignment error rate (AER), by 3.7 to 7.7 points depending on the language pair, over two recent approaches to improving the Transformer's word alignment. Moreover, our methods achieve better alignment results than GIZA++ on certain test sets.
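The AER reported above is the standard metric of Och and Ney (2003): given a predicted link set A, gold sure links S, and gold possible links P (with S ⊆ P), AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|). A minimal Python implementation for reference:

    def alignment_error_rate(predicted, sure, possible):
        """Alignment Error Rate (Och & Ney, 2003).

        predicted: set of (src_idx, tgt_idx) links produced by the model (A)
        sure:      set of sure gold links (S)
        possible:  set of possible gold links (P), with S a subset of P
        """
        a_and_s = len(predicted & sure)
        a_and_p = len(predicted & possible)
        return 1.0 - (a_and_s + a_and_p) / (len(predicted) + len(sure))

    # A perfect prediction scores 0.0:
    # alignment_error_rate({(0, 0), (1, 2)}, {(0, 0)}, {(0, 0), (1, 2)}) == 0.0

The abstract does not spell out the concentration regularizer, only that it rewards alignment-like attention distributions with more concentrated probabilities. One common way to encourage such concentration is an entropy penalty on the attention rows; the PyTorch sketch below illustrates that idea under stated assumptions (the tensor shape, the name lambda_c, and the way the penalty is combined with the translation loss are illustrative, not the paper's formulation):

    import torch

    def attention_entropy_penalty(attn):
        """Mean per-target-token entropy of attention weights.

        attn: tensor of shape (batch, tgt_len, src_len) whose rows are
        probability distributions over source positions. Lower entropy
        means the mass is concentrated on fewer source words, i.e.
        sharper, more alignment-like attention.
        """
        eps = 1e-9
        entropy = -(attn * (attn + eps).log()).sum(dim=-1)  # (batch, tgt_len)
        return entropy.mean()

    # Hypothetical combined objective: translation cross-entropy plus a
    # small weight lambda_c on the concentration penalty.
    # loss = ce_loss + lambda_c * attention_entropy_penalty(attn_probs)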
Pages: 1801-1812 (12 pages)