Axiomatically Regularized Pre-training for Ad hoc Search

Cited by: 10
Authors
Chen, Jia [1 ]
Liu, Yiqun [1 ]
Fang, Yan [1 ]
Mao, Jiaxin [2 ]
Fang, Hui [3 ]
Yang, Shenghao [1 ]
Xie, Xiaohui [1 ]
Zhang, Min [1 ]
Ma, Shaoping [1 ]
Affiliations
[1] Tsinghua University, BNRist, DCST, Beijing 100084, China
[2] Renmin University of China, GSAI, Beijing 100872, China
[3] University of Delaware, DECE, Newark, DE, USA
Source
PROCEEDINGS OF THE 45TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '22) | 2022
Keywords
Pre-training Method; Pre-trained Language Model; Axiomatic IR; Ad hoc Search
DOI
10.1145/3477495.3531943
Chinese Library Classification (CLC)
TP [Automation and computer technology]
Subject Classification Code
0812
Abstract
Recently, pre-training methods tailored for IR tasks have achieved great success. However, as the mechanisms behind their performance improvements remain under-investigated, the interpretability and robustness of these pre-trained models still need to be improved. Axiomatic IR aims to identify a set of desirable properties, expressed mathematically as formal constraints, to guide the design of ranking models. Existing studies have shown that enforcing certain axioms can improve the effectiveness and interpretability of IR models. However, little effort has been made to incorporate these IR axioms into pre-training methodologies. To shed light on this research question, we propose a novel pre-training method with Axiomatic Regularization for ad hoc Search (ARES). In the ARES framework, a number of existing IR axioms are re-organized to generate training samples that are fitted during pre-training; these samples guide neural rankers toward the desirable ranking properties. Compared to existing pre-training approaches, ARES is more intuitive and explainable. Experimental results on multiple publicly available benchmark datasets demonstrate the effectiveness of ARES in both full-resource and low-resource (e.g., zero-shot and few-shot) settings. An intuitive case study also indicates that ARES has learned useful knowledge that existing pre-trained models (e.g., BERT and PROP) fail to capture. This work provides insights into improving the interpretability of pre-trained models and offers guidance on incorporating IR axioms, or human heuristics more broadly, into pre-training methods.
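To make the axiomatic-regularization idea in the abstract concrete, the sketch below shows one way an IR axiom could be turned into pre-training supervision: a preference pair is generated from the classic TFC1 term-frequency axiom, and a ranker is fitted with a pairwise margin loss. This is a minimal illustration assuming a PyTorch ranker, not the authors' implementation; `score_model`, `tfc1_pair`, and all hyperparameters are hypothetical names chosen for this example.

```python
# Minimal sketch (not the ARES code): generating an axiom-guided training
# pair and fitting a ranker so the axiom acts as a soft constraint.
import random
import torch.nn.functional as F

def tfc1_pair(query_terms, doc_tokens, boost=2):
    """Build a (preferred, dispreferred) document pair from the TFC1 axiom:
    all else being equal, a document with a higher query-term frequency
    should receive a higher relevance score."""
    preferred = list(doc_tokens)
    term = random.choice(query_terms)
    for _ in range(boost):  # raise the frequency of one query term
        preferred.insert(random.randrange(len(preferred) + 1), term)
    return preferred, list(doc_tokens)

def axiom_regularization_loss(score_model, query, doc_pos, doc_neg, margin=1.0):
    """Pairwise hinge loss pushing the axiom-preferred document above its
    counterpart; score_model(query, doc) returns a scalar tensor."""
    s_pos = score_model(query, doc_pos)
    s_neg = score_model(query, doc_neg)
    return F.relu(margin - (s_pos - s_neg))

# Hypothetical usage during pre-training:
#   d_pos, d_neg = tfc1_pair(["retrieval", "axioms"], tokenized_doc)
#   loss = axiom_regularization_loss(ranker, query, d_pos, d_neg)
```

Other axiom families (e.g., length normalization or term discrimination) could generate pairs in the same way, so that each sampled pair encodes one desirable ranking property for the neural ranker to learn.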
Pages: 1524-1534
Page count: 11