A Thorough Examination on Zero-shot Dense Retrieval

Cited by: 0
Authors
Ren, Ruiyang [1, 3]
Qu, Yingqi [2 ]
Liu, Jing [2 ]
Zhao, Wayne Xin [1, 3]
Wu, Qifei [2 ]
Ding, Yuchen [2 ]
Wu, Hua [2 ]
Wang, Haifeng [2 ]
Wen, Ji-Rong [1, 3]
Affiliations
[1] Renmin Univ China, Gaoling Sch Artificial Intelligence, Beijing, Peoples R China
[2] Baidu Inc, Beijing, Peoples R China
[3] Beijing Key Lab Big Data Management & Anal Method, Beijing, Peoples R China
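Funding
National Natural Science Foundation of China; Beijing Natural Science Foundation;
Keywords
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract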
Recent years have witnessed significant advances in dense retrieval (DR) based on powerful pre-trained language models (PLMs). DR models have achieved excellent performance on several benchmark datasets, yet they have been shown to be less competitive than traditional sparse retrieval models (e.g., BM25) in a zero-shot retrieval setting. However, the related literature still lacks a detailed and comprehensive study of zero-shot retrieval. In this paper, we present the first thorough examination of the zero-shot capability of DR models. We aim to identify the key factors and analyze how they affect zero-shot retrieval performance. In particular, we discuss the effect of several key factors related to the source training set, analyze the potential bias from the target dataset, and review and compare existing zero-shot DR models. Our findings provide important evidence to better understand and develop zero-shot DR models.
Pages: 15783-15796
Page count: 14
Related papers
50 in total
  • [1] Boot and Switch: Alternating Distillation for Zero-Shot Dense Retrieval
    Jiang, Fan
    Xu, Qiongkai
    Drummond, Tom
    Cohn, Trevor
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS - EMNLP 2023, 2023, : 912 - 931
  • [2] Combining Multiple Supervision for Robust Zero-Shot Dense Retrieval
    Fang, Yan
    Ai, Qingyao
    Zhan, Jingtao
    Liu, Yiqun
    Wu, Xiaolong
    Cao, Zhao
    THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 16, 2024, : 17994 - 18002
  • [3] Precise Zero-Shot Dense Retrieval without Relevance Labels
    Gao, Luyu
    Ma, Xueguang
    Lin, Jimmy
    Callan, Jamie
    arXiv, 2022
  • [4] Precise Zero-Shot Dense Retrieval without Relevance Labels
    Gao, Luyu
    Ma, Xueguang
    Lin, Jimmy
    Callan, Jamie
    PROCEEDINGS OF THE 61ST ANNUAL MEETING OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS, ACL 2023, VOL 1, 2023, : 1762 - 1777
  • [5] Improving zero-shot retrieval using dense external expansion
    Wang, Xiao
    Macdonald, Craig
    Ounis, Iadh
    INFORMATION PROCESSING & MANAGEMENT, 2022, 59 (05)
  • [6] Scalable Zero-shot Entity Linking with Dense Entity Retrieval
    Wu, Ledell
    Petroni, Fabio
    Josifoski, Martin
    Riedel, Sebastian
    Zettlemoyer, Luke
    PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP), 2020, : 6397 - 6407
  • [7] LaPraDoR: Unsupervised Pretrained Dense Retriever for Zero-Shot Text Retrieval
    Xu, Canwen
    Guo, Daya
    Duan, Nan
    McAuley, Julian
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 3557 - 3569
  • [8] Zero-Shot Dense Retrieval with Momentum Adversarial Domain Invariant Representations
    Xin, Ji
    Xiong, Chenyan
    Srinivasan, Ashwin
    Sharma, Ankita
    Jose, Damien
    Bennett, Paul N.
    FINDINGS OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS (ACL 2022), 2022, : 4008 - 4020
  • [9] Improving Zero-Shot Entity Retrieval through Effective Dense Representations
    Partalidou, Eleni
    Christou, Despina
    Tsoumakas, Grigorios
    PROCEEDINGS OF THE 12TH HELLENIC CONFERENCE ON ARTIFICIAL INTELLIGENCE, SETN 2022, 2022
  • [10] mAggretriever: A Simple Yet Effective Approach to Zero-Shot Multilingual Dense Retrieval
    Lin, Sheng-Chieh
    Ahmad, Amin
    Lin, Jimmy
    2023 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2023), 2023, : 11688 - 11696