When People Change their Mind: Off-Policy Evaluation in Non-stationary Recommendation Environments

Cited by: 38

Authors:

Jagerman, Rolf [1]

Markov, Ilya [1]

de Rijke, Maarten [1]

Affiliations:

[1] University of Amsterdam, Amsterdam, Netherlands

Source:

PROCEEDINGS OF THE TWELFTH ACM INTERNATIONAL CONFERENCE ON WEB SEARCH AND DATA MINING (WSDM'19) | 2019

Keywords:

Off-policy evaluation; Non-stationary rewards

DOI:

10.1145/3289600.3290958

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

We consider the novel problem of evaluating a recommendation policy offline in environments where the reward signal is non-stationary. Non-stationarity appears in many Information Retrieval (IR) applications such as recommendation and advertising, but its effect on off-policy evaluation has not been studied at all. We are the first to address this issue. First, we analyze standard off-policy estimators in non-stationary environments and show both theoretically and experimentally that their bias grows with time. Then, we propose new off-policy estimators with moving averages and show that their bias is independent of time and can be bounded. Furthermore, we provide a method to trade off bias and variance in a principled way to get an off-policy estimator that works well in both non-stationary and stationary environments. We experiment on publicly available recommendation datasets and show that our newly proposed moving average estimators accurately capture changes in non-stationary environments, while standard off-policy estimators fail to do so.
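The contrast the abstract draws between standard and moving-average off-policy estimators can be illustrated with a minimal sketch. This is not the authors' estimator: it assumes a plain inverse propensity scoring (IPS) estimator, a simple sliding window as the moving average, and a hypothetical two-action simulation in which the click rate of one action jumps from 0.2 to 0.8 halfway through the log. All function names and parameter values below are illustrative assumptions.

```python
import random


def ips_estimate(rewards, target_probs, logging_probs):
    """Standard IPS: mean of importance-weighted rewards over all logged interactions."""
    weighted = [r * p / q for r, p, q in zip(rewards, target_probs, logging_probs)]
    return sum(weighted) / len(weighted)


def windowed_ips_estimate(rewards, target_probs, logging_probs, window):
    """IPS restricted to the `window` most recent interactions (a simple moving average)."""
    if window <= 0:
        raise ValueError("window must be positive")
    return ips_estimate(rewards[-window:], target_probs[-window:], logging_probs[-window:])


def simulate_logs(n, seed=0):
    """Hypothetical log: uniform logging policy over two actions, reward shift at n // 2."""
    rng = random.Random(seed)
    rewards, target_probs, logging_probs = [], [], []
    for t in range(n):
        action = rng.randrange(2)                # logging policy picks each action with prob 0.5
        click_rate = 0.2 if t < n // 2 else 0.8  # assumed non-stationary reward of action 1
        reward = 1.0 if (action == 1 and rng.random() < click_rate) else 0.0
        rewards.append(reward)
        target_probs.append(1.0 if action == 1 else 0.0)  # target policy always plays action 1
        logging_probs.append(0.5)
    return rewards, target_probs, logging_probs


rewards, target_probs, logging_probs = simulate_logs(20000)
full = ips_estimate(rewards, target_probs, logging_probs)
recent = windowed_ips_estimate(rewards, target_probs, logging_probs, window=2000)
print(f"IPS over all logs:       {full:.3f}")
print(f"IPS over last 2000 logs: {recent:.3f}")
```

In this toy setting the target policy's current value is 0.8. The full-history estimate averages over both regimes and lands near the midpoint, while the windowed estimate tracks the recent regime at the cost of higher variance, which is the bias-variance trade-off the abstract refers to.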

Pages: 447-455

Page count: 9
