Adversarial Training for Unsupervised Bilingual Lexicon Induction

Cited by: 158

Authors:

Zhang, Meng [1,2]

Liu, Yang [1,2]

Luan, Huanbo [1]

Sun, Maosong [1,2]

Affiliations:

[1] Tsinghua Univ, Dept Comp Sci & Technol, Tsinghua Natl Lab Informat Sci & Technol, State Key Lab Intelligent Technol & Syst, Beijing, Peoples R China

[2] Jiangsu Collaborat Innovat Ctr Language Competenc, Xuzhou, Jiangsu, Peoples R China

Source:

PROCEEDINGS OF THE 55TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2017), VOL 1 | 2017

Funding:

National Research Foundation, Singapore; National Natural Science Foundation of China

Keywords:

DOI:

10.18653/v1/P17-1179

Chinese Library Classification (CLC) number:

TP39 [Computer applications]

Discipline classification codes:

081203; 0835

Abstract:

Word embeddings are well known to capture linguistic regularities of the language on which they are trained. Researchers also observe that these regularities can transfer across languages. However, previous endeavors to connect separate monolingual word embeddings typically require cross-lingual signals as supervision, either in the form of parallel corpus or seed lexicon. In this work, we show that such cross-lingual connection can actually be established without any form of supervision. We achieve this end by formulating the problem as a natural adversarial game, and investigating techniques that are crucial to successful training. We carry out evaluation on the unsupervised bilingual lexicon induction task. Even though this task appears intrinsically cross-lingual, we are able to demonstrate encouraging performance without any cross-lingual clues.

Pages: 1959-1970

Page count: 12