Not All Relevance Scores are Equal: Efficient Uncertainty and Calibration Modeling for Deep Retrieval Models

Cited by: 11

Authors:

Cohen, Daniel [1]

Mitra, Bhaskar [2]

Lesota, Oleg [3]

Rekabsaz, Navid [3,4]

Eickhoff, Carsten [1]

Affiliations:

[1] Brown Univ, Providence, RI 02912 USA

[2] Microsoft, Montreal, PQ, Canada

[3] Johannes Kepler Univ Linz, Linz, Austria

[4] Linz Inst Technol, AI Lab, Linz, Austria

Source:

SIGIR '21 - PROCEEDINGS OF THE 44TH INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL | 2021

Keywords:

uncertainty; neural networks; calibration; search

DOI:

10.1145/3404835.3462951

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

In any ranking system, the retrieval model outputs a single score for a document based on its belief about how relevant the document is to a given search query. While retrieval models have continued to improve with the introduction of increasingly complex architectures, few works have investigated a retrieval model's belief in the score beyond the scope of a single value. We argue that capturing the model's uncertainty with respect to its own scoring of a document is a critical aspect of retrieval that allows for greater use of current models across new document distributions and collections, and can even improve effectiveness on downstream tasks. In this paper, we address this problem via an efficient Bayesian framework for retrieval models which captures the model's belief in the relevance score through a stochastic process while adding only negligible computational overhead. We evaluate this belief via a ranking-based calibration metric, showing that our approximate Bayesian framework significantly improves a retrieval model's ranking effectiveness through risk-aware reranking, as well as its confidence calibration. Lastly, we demonstrate that this additional uncertainty information is actionable and reliable on downstream tasks, represented here by cutoff prediction.
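The abstract does not spell out how risk-aware reranking is computed. As a minimal sketch only, the Python below assumes that several stochastic relevance scores are already available for each document and ranks documents by their mean score minus a penalty proportional to the spread of those scores. The function name `risk_aware_rerank`, the risk weight `b`, the mean-minus-standard-deviation rule, and the sample data are all illustrative assumptions, not details taken from the paper.

```python
from statistics import mean, pstdev


def risk_aware_rerank(sampled_scores, b=1.0):
    """Return doc ids sorted by mean score minus b times the score spread.

    sampled_scores: dict mapping a doc id to a list of stochastic relevance
    scores (hypothetical input; how the samples are drawn is an assumption).
    b: risk weight (hypothetical); b = 0 reduces to ranking by the mean.
    """
    adjusted = {}
    for doc_id, scores in sampled_scores.items():
        mu = mean(scores)       # expected relevance score
        sigma = pstdev(scores)  # population standard deviation as uncertainty
        adjusted[doc_id] = mu - b * sigma
    # Highest risk-adjusted score first.
    return sorted(adjusted, key=adjusted.get, reverse=True)


# Made-up scores: d1 has the highest mean but also the widest spread.
samples = {
    "d1": [0.9, 0.5, 0.7],
    "d2": [0.65, 0.66, 0.64],
    "d3": [0.2, 0.3, 0.25],
}

print(risk_aware_rerank(samples, b=0.0))
print(risk_aware_rerank(samples, b=1.0))
```

With `b = 0` the order follows the mean scores alone. With a positive `b`, a document whose scores vary widely (here `d1`) can drop below a slightly lower-scoring but more stable one (`d2`), which is the intuition behind penalizing uncertain scores.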


Pages: 654-664

Page count: 11
