A General Framework for Counterfactual Learning-to-Rank

Cited by: 87
Authors
Agarwal, Aman [1 ]
Takatsu, Kenta [1 ]
Zaitsev, Ivan [1 ]
Joachims, Thorsten [1 ]
Affiliations
[1] Cornell University, Ithaca, NY 14850, USA
Source
PROCEEDINGS OF THE 42ND INTERNATIONAL ACM SIGIR CONFERENCE ON RESEARCH AND DEVELOPMENT IN INFORMATION RETRIEVAL (SIGIR '19) | 2019
Keywords
Learning to rank; presentation bias; counterfactual inference
DOI
10.1145/3331184.3331202
CLC number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
Implicit feedback (e.g., clicks, dwell time) is an attractive source of training data for Learning-to-Rank, but its naive use leads to learning results that are distorted by presentation bias. For the special case of optimizing average rank for linear ranking functions, however, the recently developed SVM-PropRank method has shown that counterfactual inference techniques can be used to provably overcome the distorting effect of presentation bias. Going beyond this special case, this paper provides a general and theoretically rigorous framework for counterfactual learning-to-rank that enables unbiased training for a broad class of additive ranking metrics (e.g., Discounted Cumulative Gain (DCG)) as well as a broad class of models (e.g., deep networks). Specifically, we derive a relaxation for propensity-weighted rank-based metrics that is subdifferentiable and thus suitable for gradient-based optimization. We demonstrate the effectiveness of this general approach by instantiating two new learning methods. One is a new type of unbiased SVM that optimizes DCG, called SVM PropDCG, and we show how the resulting optimization problem can be solved via the Convex-Concave Procedure (CCP). The other is Deep PropDCG, where the ranking function can be an arbitrary deep network. In addition to the theoretical support, we empirically find that SVM PropDCG significantly outperforms existing linear rankers in terms of DCG. Moreover, the ability to train non-linear ranking functions via Deep PropDCG further improves performance.
Pages: 5-14
Page count: 10
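
To make the abstract's key idea concrete, below is a minimal sketch (not the authors' code) of a propensity-weighted, subdifferentiable DCG surrogate in the spirit of Deep PropDCG. It assumes PyTorch; the names ScoringNet, smooth_rank, and prop_dcg_loss are hypothetical, and the sigmoid smoothing of the rank is a generic stand-in for the hinge-based relaxation derived in the paper. Each clicked document contributes 1/log2(1 + rank) to the objective, reweighted by the inverse of its estimated examination propensity.

import torch
import torch.nn as nn

class ScoringNet(nn.Module):
    """Toy feed-forward ranker: maps a document feature vector to a scalar score."""
    def __init__(self, num_features, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_features, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x):                 # x: (num_docs, num_features)
        return self.net(x).squeeze(-1)    # (num_docs,) scores

def smooth_rank(scores, temperature=1.0):
    # Differentiable relaxation of each document's rank:
    # rank_i ~= 1 + sum_{j != i} sigmoid((s_j - s_i) / T)
    diff = (scores.unsqueeze(0) - scores.unsqueeze(1)) / temperature  # diff[i, j] = s_j - s_i
    pairwise = torch.sigmoid(diff) * (1.0 - torch.eye(len(scores)))   # zero out self-comparisons
    return 1.0 + pairwise.sum(dim=1)

def prop_dcg_loss(scores, click_idx, propensities):
    # IPS-weighted negative DCG over the clicked documents:
    # loss = - sum_{i in clicks} (1 / p_i) * 1 / log2(1 + rank_i)
    ranks = smooth_rank(scores)
    gains = 1.0 / torch.log2(1.0 + ranks[click_idx])
    return -(gains / propensities).sum()

# Toy usage: one query with 5 candidate documents and 2 observed clicks.
torch.manual_seed(0)
features = torch.randn(5, 10)               # hypothetical document features
clicks = torch.tensor([1, 3])               # indices of clicked documents
propensities = torch.tensor([0.8, 0.3])     # estimated examination propensities
model = ScoringNet(num_features=10)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

for step in range(200):
    optimizer.zero_grad()
    loss = prop_dcg_loss(model(features), clicks, propensities)
    loss.backward()
    optimizer.step()

Minimizing this loss pushes clicked documents toward better smoothed ranks, while the 1/p_i weights correct for the fact that documents presented lower in the ranking are examined, and hence clicked, less often.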