Dense Passage Retrieval for Open-Domain Question Answering

Citations: 0
Authors
Karpukhin, Vladimir [1]
Oguz, Barlas [1]
Min, Sewon [2]
Lewis, Patrick [1]
Wu, Ledell [1]
Edunov, Sergey [1]
Chen, Danqi [3]
Yih, Wen Tau [1]
Affiliations
[1] Facebook AI, London, England
[2] Univ Washington, Seattle, WA USA
[3] Princeton Univ, Princeton, NJ USA
Source
PROCEEDINGS OF THE 2020 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP) | 2020
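Keywords
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract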
Open-domain question answering relies on efficient passage retrieval to select candidate contexts, where traditional sparse vector space models, such as TF-IDF or BM25, are the de facto method. In this work, we show that retrieval can be practically implemented using dense representations alone, where embeddings are learned from a small number of questions and passages by a simple dual-encoder framework. When evaluated on a wide range of open-domain QA datasets, our dense retriever outperforms a strong Lucene-BM25 system greatly by 9%-19% absolute in terms of top-20 passage retrieval accuracy, and helps our end-to-end QA system establish new state-of-the-art on multiple open-domain QA benchmarks.
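The abstract's two technical ideas, scoring passages against a question with dense embeddings and measuring top-k retrieval accuracy, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy two-dimensional vectors, and the gold labels are hypothetical, and the use of a dot product as the similarity score is an assumption, since the abstract does not name one. In a real dual-encoder system the vectors would come from two trained encoders, one for questions and one for passages.

```python
from typing import List, Sequence, Set


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Inner product, assumed here as the question-passage similarity score."""
    return sum(a * b for a, b in zip(u, v))


def top_k(question_vec: Sequence[float],
          passage_vecs: List[Sequence[float]],
          k: int) -> List[int]:
    """Return the indices of the k highest-scoring passages, best first."""
    scores = [dot(question_vec, p) for p in passage_vecs]
    ranked = sorted(range(len(passage_vecs)),
                    key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def top_k_accuracy(question_vecs: List[Sequence[float]],
                   passage_vecs: List[Sequence[float]],
                   relevant: List[Set[int]],
                   k: int) -> float:
    """Fraction of questions with at least one relevant passage in the top k."""
    hits = sum(
        1
        for q, gold_ids in zip(question_vecs, relevant)
        if gold_ids & set(top_k(q, passage_vecs, k))
    )
    return hits / len(question_vecs)


# Hypothetical hand-written embeddings standing in for encoder outputs.
passages = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
questions = [[0.9, 0.1], [0.1, 0.9]]
gold = [{0}, {2}]  # index of the relevant passage for each question

print(top_k_accuracy(questions, passages, gold, k=1))
print(top_k_accuracy(questions, passages, gold, k=2))
```

With k set to 20 and a real passage collection, `top_k_accuracy` corresponds to the top-20 passage retrieval accuracy the abstract reports; the exhaustive scoring loop here would normally be replaced by an approximate nearest-neighbor index.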
Pages: 6769-6781
Page count: 13
Related papers
50 in total
[31]   SPARTA: Efficient Open-Domain Question Answering via Sparse Transformer Matching Retrieval [J].
Zhao, Tiancheng ;
Lu, Xiaopeng ;
Lee, Kyusong .
2021 CONFERENCE OF THE NORTH AMERICAN CHAPTER OF THE ASSOCIATION FOR COMPUTATIONAL LINGUISTICS: HUMAN LANGUAGE TECHNOLOGIES (NAACL-HLT 2021), 2021, :565-575
[32]   PyGaggle: A Gaggle of Resources for Open-Domain Question Answering [J].
Pradeep, Ronak ;
Chen, Haonan ;
Gu, Lingwei ;
Tamber, Manveer Singh ;
Lin, Jimmy .
ADVANCES IN INFORMATION RETRIEVAL, ECIR 2023, PT III, 2023, 13982 :148-162
[33]   Adaptive Information Seeking for Open-Domain Question Answering [J].
Zhu, Yunchang ;
Pang, Liang ;
Lan, Yanyan ;
Shen, Huawei ;
Cheng, Xueqi .
2021 CONFERENCE ON EMPIRICAL METHODS IN NATURAL LANGUAGE PROCESSING (EMNLP 2021), 2021, :3615-3626
[34]   ODSQA: OPEN-DOMAIN SPOKEN QUESTION ANSWERING DATASET [J].
Lee, Chia-Hsuan ;
Wang, Shang-Ming ;
Chang, Huan-Cheng ;
Lee, Hung-Yi .
2018 IEEE WORKSHOP ON SPOKEN LANGUAGE TECHNOLOGY (SLT 2018), 2018, :949-956
[35]   RRQA: reconfirmed reader for open-domain question answering [J].
Li, Shi ;
Zhang, Wenqian .
APPLIED INTELLIGENCE, 2023, 53 (15) :18420-18430
[36]   Query Context Expansion for Open-Domain Question Answering [J].
Zhu, Wenhao ;
Zhang, Xiaoyu ;
Ye, Liang ;
Zhai, Qiuhong .
ACM TRANSACTIONS ON ASIAN AND LOW-RESOURCE LANGUAGE INFORMATION PROCESSING, 2023, 22 (08)
[37]   Adaptive Batch Scheduling for Open-Domain Question Answering [J].
Choi, Donghyun ;
Shin, Myeongcheol ;
Kim, Eunggyun ;
Shin, Dong Ryeol .
IEEE ACCESS, 2021, 9 :112097-112103
[38]   RRQA: reconfirmed reader for open-domain question answering [J].
Shi Li ;
Wenqian Zhang .
Applied Intelligence, 2023, 53 :18420-18430
[39]   Designing an interactive open-domain question answering system [J].
Quarteroni, S. ;
Manandhar, S. .
NATURAL LANGUAGE ENGINEERING, 2009, 15 :73-95
[40]   Document Gated Reader for Open-Domain Question Answering [J].
Wang, Bingning ;
Yao, Ting ;
Zhang, Qi ;
Xu, Jingfang ;
Tian, Zhixing ;
Liu, Kang ;
Zhao, Jun .
PROCEEDINGS OF THE 42ND INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '19), 2019, :85-94