mCLIP: Multilingual CLIP via Cross-lingual Transfer

Authors
Chen, Guanhua [1]
Hou, Lu [2]
Chen, Yun [3]
Dai, Wenliang [5]
Shang, Lifeng [2]
Jiang, Xin [2]
Liu, Qun [2]
Pan, Jia [4]
Wang, Wenping [6]
Affiliations
[1] Southern Univ Sci & Technol, Shenzhen, Peoples R China
[2] Huawei Noahs Ark Lab, Montreal, PQ, Canada
[3] Shanghai Univ Finance & Econ, Shanghai, Peoples R China
[4] Univ Hong Kong, Hong Kong, Peoples R China
[5] Hong Kong Univ Sci & Technol, Hong Kong, Peoples R China
[6] Texas A&M Univ, College Stn, TX USA
Source
PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2023): LONG PAPERS, VOL 1 | 2023
Funding
National Natural Science Foundation of China
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Large-scale vision-language pretrained (VLP) models like CLIP have shown remarkable performance on various downstream cross-modal tasks. However, they are usually biased towards English due to the lack of sufficient non-English image-text pairs. Existing multilingual VLP methods often learn retrieval-inefficient single-stream models from translation-augmented non-English image-text pairs. In this paper, we introduce mCLIP, a retrieval-efficient dual-stream multilingual VLP model, trained by aligning the CLIP model and a Multilingual Text Encoder (MTE) through a novel Triangle Cross-modal Knowledge Distillation (TriKD) method. It is parameter-efficient, as only two light projectors on top of them are updated during distillation. Furthermore, to enhance the token- and sentence-level multilingual representation of the MTE, we propose to train it jointly with machine translation and contrastive learning before TriKD to provide a better initialization. Empirical results show that mCLIP achieves new state-of-the-art performance on both zero-shot and finetuned multilingual image-text retrieval tasks.
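The dual-stream, projector-only design described in the abstract can be illustrated with a small sketch. This is not the authors' implementation and does not reproduce TriKD; it only shows, under stated assumptions, why a dual-stream model is retrieval-efficient (image embeddings can be indexed once and compared with a text query by a dot product) and what "only two light projectors are updated" might look like structurally. The feature vectors are random stand-ins for frozen encoder outputs, and all names, dimensions, and the placement of one projector per stream are hypothetical.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def make_projector(out_dim, in_dim, rng):
    """Hypothetical light linear projector: the only trainable part in this sketch."""
    scale = 1.0 / math.sqrt(in_dim)
    return [[rng.gauss(0.0, scale) for _ in range(in_dim)] for _ in range(out_dim)]


def project(matrix, vec):
    """Apply a linear projector (one row per output dimension)."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def rank_images(text_emb, image_index):
    """Rank indexed images by similarity to one text embedding, best first."""
    scores = [sum(t * v for t, v in zip(text_emb, img)) for img in image_index]
    return sorted(range(len(image_index)), key=lambda i: scores[i], reverse=True)


rng = random.Random(0)
IMG_DIM, TXT_DIM, SHARED_DIM = 8, 6, 4  # illustrative sizes only

# Random stand-ins for the outputs of frozen encoders (assumption: no real
# CLIP or multilingual text encoder is loaded here).
image_feats = [[rng.gauss(0.0, 1.0) for _ in range(IMG_DIM)] for _ in range(5)]
text_feat = [rng.gauss(0.0, 1.0) for _ in range(TXT_DIM)]

# Two small projectors map both streams into one shared space.
image_projector = make_projector(SHARED_DIM, IMG_DIM, rng)
text_projector = make_projector(SHARED_DIM, TXT_DIM, rng)

# Dual-stream retrieval: the image side is embedded once, offline.
image_index = [l2_normalize(project(image_projector, f)) for f in image_feats]

# At query time only the text stream runs, followed by cheap dot products.
query = l2_normalize(project(text_projector, text_feat))
ranking = rank_images(query, image_index)
print(ranking)
```

Because the projectors here are untrained, the ranking itself is meaningless; the point is the data flow. A single-stream model would instead need a joint forward pass for every image-text pair at query time.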
Pages: 13028-13043
Page count: 16