Adversarial Training for Unsupervised Bilingual Lexicon Induction

Cited by: 158
Authors
Zhang, Meng [1, 2]
Liu, Yang [1, 2]
Luan, Huanbo [1]
Sun, Maosong [1, 2]
Affiliations
[1] Tsinghua Univ, Dept Comp Sci & Technol, Tsinghua Natl Lab Informat Sci & Technol, State Key Lab Intelligent Technol & Syst, Beijing, Peoples R China
[2] Jiangsu Collaborat Innovat Ctr Language Competenc, Xuzhou, Jiangsu, Peoples R China
Source
PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 1 | 2017
Funding
National Research Foundation, Singapore; National Natural Science Foundation of China
DOI
10.18653/v1/P17-1179
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Word embeddings are well known to capture linguistic regularities of the language on which they are trained. Researchers also observe that these regularities can transfer across languages. However, previous endeavors to connect separate monolingual word embeddings typically require cross-lingual signals as supervision, either in the form of parallel corpus or seed lexicon. In this work, we show that such cross-lingual connection can actually be established without any form of supervision. We achieve this end by formulating the problem as a natural adversarial game, and investigating techniques that are crucial to successful training. We carry out evaluation on the unsupervised bilingual lexicon induction task. Even though this task appears intrinsically cross-lingual, we are able to demonstrate encouraging performance without any cross-lingual clues.
Pages: 1959-1970
Page count: 12