Towards Better Word Alignment in Transformer

Cited: 5
Authors
Song, Kai [1,2]
Zhou, Xiaoqing [1]
Yu, Heng [3]
Huang, Zhongqiang [3]
Zhang, Yue [4]
Luo, Weihua [3]
Duan, Xiangyu [1]
Zhang, Min [1]
Affiliations
[1] Soochow Univ, Suzhou 215000, Peoples R China
[2] DAMO Acad, Hangzhou 310051, Peoples R China
[3] Alibaba DAMO Acad, Hangzhou 310051, Peoples R China
[4] Westlake Univ, Hangzhou 310000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Decoding; Data models; Training; Context modeling; Standards; Speech processing; Error analysis; Neural network; neural machine translation; Transformer; word alignment; language model pre-training; alignment concentration;
DOI
10.1109/TASLP.2020.2998278
Chinese Library Classification
O42 [Acoustics];
Subject Classification Codes
070206; 082403;
Abstract
While neural models based on the Transformer architecture achieve state-of-the-art translation performance, it is well known that the learned target-to-source attentions do not correlate well with word alignment. There is increasing interest in inducing accurate word alignment in the Transformer, given its important role in practical applications such as dictionary-guided translation and interactive translation. In this article, we extend and improve recent work on unsupervised learning of word alignment in the Transformer along two dimensions: a) parameter initialization from a pre-trained cross-lingual language model, which leverages large amounts of monolingual data to learn robust contextualized word representations, and b) regularization of the training objective to directly model characteristics of word alignments, so that favorable word alignments receive more concentrated probabilities. Experiments on benchmark data sets for three language pairs show that the proposed methods significantly reduce the alignment error rate (AER), by at least 3.7 and as much as 7.7 points on each language pair, compared with two recent works on improving the Transformer's word alignment. Moreover, our methods achieve better alignment results than GIZA++ on certain test sets.
Pages: 1801-1812
Number of pages: 12
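
The abstract mentions an "alignment concentration" regularizer and reports gains in alignment error rate (AER). As a minimal illustrative sketch only, not the paper's actual formulation, the PyTorch snippet below shows one common way to encourage concentrated attention, an entropy penalty on the target-to-source attention matrix, alongside the standard AER metric of Och and Ney (2003). The function names, the use of PyTorch, and the weighting into the loss are all assumptions for illustration.

```python
import torch

def concentration_regularizer(attn: torch.Tensor, eps: float = 1e-9) -> torch.Tensor:
    """Hypothetical entropy penalty: pushes each target token's attention
    distribution to concentrate on a few source tokens.

    attn: attention probabilities of shape (batch, tgt_len, src_len),
          each row summing to 1 over the source dimension.
    """
    # Shannon entropy per target position; low entropy = concentrated attention.
    entropy = -(attn * (attn + eps).log()).sum(dim=-1)  # (batch, tgt_len)
    return entropy.mean()

def alignment_error_rate(hypothesis, sure, possible):
    """Standard AER (Och & Ney, 2003); lower is better.

    hypothesis, sure, possible: iterables of (src_idx, tgt_idx) links,
    with the sure links a subset of the possible links.
    """
    a, s, p = set(hypothesis), set(sure), set(possible)
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))
```

In training, such a penalty would typically be added to the translation loss with a small weight (e.g. loss + lambda * concentration_regularizer(attn)); the paper's actual regularizer may model alignment characteristics differently.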