Evaluating Resource-Lean Cross-Lingual Embedding Models in Unsupervised Retrieval

Cited by: 11
Authors
Litschko, Robert [1]
Glavas, Goran [1]
Vulic, Ivan [2]
Dietz, Laura [3]
Affiliations
[1] Univ Mannheim, Mannheim, Germany
[2] Univ Cambridge, Cambridge, England
[3] Univ New Hampshire, Durham, NH 03824 USA
Source
PROCEEDINGS OF THE 42ND INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '19) | 2019
Keywords
Cross-Lingual IR; Cross-Lingual Embeddings; CLIR Evaluation;
DOI
10.1145/3331184.3331324
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline classification code
0812;
Abstract
Cross-lingual embeddings (CLE) facilitate cross-lingual natural language processing and information retrieval. Recently, a wide variety of resource-lean projection-based models for inducing CLEs has been introduced, requiring limited or no bilingual supervision. Despite potential usefulness in downstream IR and NLP tasks, these CLE models have almost exclusively been evaluated on word translation tasks. In this work, we provide a comprehensive comparative evaluation of projection-based CLE models for both sentence-level and document-level cross-lingual Information Retrieval (CLIR). We show that in some settings resource-lean CLE-based CLIR models may outperform resource-intensive models using full-blown machine translation (MT). We hope our work serves as a guideline for choosing the right model for CLIR practitioners.
Pages: 1109-1112
Page count: 4