On the Limitations of Unsupervised Bilingual Dictionary Induction

Cited by: 0
Authors
Sogaard, Anders [1]
Ruder, Sebastian [2, 3]
Vulic, Ivan [4]
Affiliations
[1] Univ Copenhagen, Copenhagen, Denmark
[2] Natl Univ Ireland, Insight Res Ctr, Galway, Ireland
[3] Aylien Ltd, Dublin, Ireland
[4] Univ Cambridge, Language Technol Lab, Cambridge, England
Source
PROCEEDINGS OF THE 56TH ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL), VOL 1 | 2018
Funding
Science Foundation Ireland
Keywords
DOI
Not available
Chinese Library Classification number
TP39 [Computer applications]
Discipline classification code
081203; 0835
Abstract
Unsupervised machine translation (i.e., not assuming any cross-lingual supervision signal, whether a dictionary, translations, or comparable corpora) seems impossible, but nevertheless, Lample et al. (2018a) recently proposed a fully unsupervised machine translation (MT) model. The model relies heavily on an adversarial, unsupervised alignment of word embedding spaces for bilingual dictionary induction (Conneau et al., 2018), which we examine here. Our results identify the limitations of current unsupervised MT: unsupervised bilingual dictionary induction performs much worse on morphologically rich languages that are not dependent marking, when monolingual corpora from different domains or different embedding algorithms are used. We show that a simple trick, exploiting a weak supervision signal from identical words, enables more robust induction, and establish a near-perfect correlation between unsupervised bilingual dictionary induction performance and a previously unexplored graph similarity metric.
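The abstract does not say how the identical-word signal is turned into an alignment, so the following is only a minimal sketch under stated assumptions: words spelled identically in both vocabularies serve as a seed dictionary, an orthogonal map is fitted to the seed pairs with the standard Procrustes solution, and translations are read off by cosine nearest neighbour. The function names (`identical_word_seed`, `procrustes`, `induce_dictionary`) are hypothetical and not taken from the paper.

```python
import numpy as np


def identical_word_seed(src_vocab, tgt_vocab):
    """Return (source index, target index) pairs for identically spelled words."""
    tgt_index = {word: j for j, word in enumerate(tgt_vocab)}
    return [(i, tgt_index[w]) for i, w in enumerate(src_vocab) if w in tgt_index]


def procrustes(X, Y):
    """Orthogonal matrix W minimising the Frobenius norm of X @ W - Y."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt


def induce_dictionary(src_vocab, tgt_vocab, X, Y):
    """Map source embeddings X onto target embeddings Y and pick nearest neighbours.

    Assumption: the mapping is learned only from identically spelled words.
    """
    seed = identical_word_seed(src_vocab, tgt_vocab)
    if not seed:
        raise ValueError("no identically spelled words to use as a seed")
    src_idx, tgt_idx = (list(idx) for idx in zip(*seed))
    W = procrustes(X[src_idx], Y[tgt_idx])

    # Compare mapped source vectors with target vectors by cosine similarity.
    mapped = X @ W
    mapped = mapped / np.linalg.norm(mapped, axis=1, keepdims=True)
    targets = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    nearest = (mapped @ targets.T).argmax(axis=1)
    return {src_vocab[i]: tgt_vocab[j] for i, j in enumerate(nearest)}
```

The seed here is deliberately naive: identical strings such as shared names or loanwords are assumed to be translations of each other, which is a weak and noisy signal rather than a curated dictionary.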
Pages: 778-788
Page count: 11