Towards Better Word Alignment in Transformer

Cited by: 5
Authors
Song, Kai [1, 2]
Zhou, Xiaoqing [1]
Yu, Heng [3]
Huang, Zhongqiang [3]
Zhang, Yue [4]
Luo, Weihua [3]
Duan, Xiangyu [1]
Zhang, Min [1]
Affiliations
[1] Soochow Univ, Suzhou 215000, Peoples R China
[2] DAMO Acad, Hangzhou 310051, Peoples R China
[3] Alibaba DAMO Acad, Hangzhou 310051, Peoples R China
[4] Westlake Univ, Hangzhou 310000, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Decoding; Data models; Training; Context modeling; Standards; Speech processing; Error analysis; Neural network; neural machine translation; Transformer; word alignment; language model pre-training; alignment concentration;
DOI
10.1109/TASLP.2020.2998278
Chinese Library Classification number
O42 [Acoustics];
Discipline classification code
070206 ; 082403 ;
Abstract
While neural models based on the Transformer architecture achieve state-of-the-art translation performance, it is well known that the learned target-to-source attentions do not correlate well with word alignment. There is increasing interest in inducing accurate word alignment in Transformer, due to its important role in practical applications such as dictionary-guided translation and interactive translation. In this article, we extend and improve the recent work on unsupervised learning of word alignment in Transformer along two dimensions: a) parameter initialization from a pre-trained cross-lingual language model to leverage large amounts of monolingual data for learning robust contextualized word representations, and b) regularization of the training objective to directly model characteristics of word alignments, which results in favorable word alignments receiving more concentrated probabilities. Experiments on benchmark data sets of three language pairs show that the proposed methods can significantly reduce alignment error rate (AER) by at least 3.7 to 7.7 points on each language pair over two recent works on improving the Transformer's word alignment. Moreover, our methods can achieve better alignment results than GIZA++ on certain test sets.
Pages: 1801-1812
Number of pages: 12