Keeping Dataset Biases out of the Simulation: A Debiased Simulator for Reinforcement Learning based Recommender Systems

Cited by: 45

Authors:

Huang, Jin [1]

Oosterhuis, Harrie [1]

de Rijke, Maarten [1, 2]

van Hoof, Herke [1]

Affiliations:

[1] Univ Amsterdam, Amsterdam, Netherlands

[2] Ahold Delhaize, Amsterdam, Netherlands

Source:

RECSYS 2020: 14TH ACM CONFERENCE ON RECOMMENDER SYSTEMS | 2020

Keywords:

Reinforcement learning; Recommender systems; Simulation; Interaction bias; ENVIRONMENT

DOI:

10.1145/3383313.3412252

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Reinforcement learning for recommendation (RL4Rec) methods are increasingly receiving attention as an effective way to improve long-term user engagement. However, applying RL4Rec online comes with risks: exploration may lead to periods of detrimental user experience. Moreover, few researchers have access to real-world recommender systems. Simulations have been put forward as a solution where user feedback is simulated based on logged historical user data, thus enabling optimization and evaluation without being run online. While simulators do not risk the user experience and are widely accessible, we identify an important limitation of existing simulation methods. They ignore the interaction biases present in logged user data, and consequently, these biases affect the resulting simulation. As a solution to this issue, we introduce a debiasing step in the simulation pipeline, which corrects for the biases present in the logged data before it is used to simulate user behavior. To evaluate the effects of bias on RL4Rec simulations, we propose a novel evaluation approach for simulators that considers the performance of policies optimized with the simulator. Our results reveal that the biases from logged data negatively impact the resulting policies, unless corrected for with our debiasing method. While our debiasing methods can be applied to any simulator, we make our complete pipeline publicly available as the Simulator for OFfline leArning and evaluation (SOFA): the first simulator that accounts for interaction biases prior to optimization and evaluation.

Pages: 190-199

Page count: 10
