Improving BERT-based Query-by-Document Retrieval with Multi-task Optimization

Cited by: 14
Authors
Abolghasemi, Amin [1]
Verberne, Suzan [1]
Azzopardi, Leif [2]
Affiliations
[1] Leiden Univ, Leiden, Netherlands
[2] Univ Strathclyde, Glasgow, Lanark, Scotland
Source
ADVANCES IN INFORMATION RETRIEVAL, PT II | 2022 / Vol. 13186
Keywords
Query-by-document retrieval; BERT-based ranking; Multi-task optimization
DOI
10.1007/978-3-030-99739-7_1
Chinese Library Classification
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Query-by-document (QBD) retrieval is an Information Retrieval task in which a seed document acts as the query and the goal is to retrieve related documents; it is particularly common in professional search tasks. In this work we improve the retrieval effectiveness of the BERT re-ranker by proposing an extension to its fine-tuning step that better exploits the context of queries. To this end, we use an additional document-level representation learning objective alongside the ranking objective when fine-tuning the BERT re-ranker. Our experiments on two QBD retrieval benchmarks show that the proposed multi-task optimization significantly improves ranking effectiveness without changing the BERT re-ranker or using additional training samples. In future work, the generalizability of our approach to other retrieval tasks should be further investigated.
Pages: 3-12
Page count: 10