Keeping Dataset Biases out of the Simulation: A Debiased Simulator for Reinforcement Learning based Recommender Systems

Cited by: 45

Authors:

Huang, Jin [1]

Oosterhuis, Harrie [1]

de Rijke, Maarten [1,2]

van Hoof, Herke [1]

Affiliations:

[1] Univ Amsterdam, Amsterdam, Netherlands

[2] Ahold Delhaize, Amsterdam, Netherlands

Source:

RECSYS 2020: 14TH ACM CONFERENCE ON RECOMMENDER SYSTEMS | 2020

Keywords:

Reinforcement learning; Recommender systems; Simulation; Interaction bias; Environment

DOI:

10.1145/3383313.3412252

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Reinforcement learning for recommendation (RL4Rec) methods are increasingly receiving attention as an effective way to improve long-term user engagement. However, applying RL4Rec online comes with risks: exploration may lead to periods of detrimental user experience. Moreover, few researchers have access to real-world recommender systems. Simulations have been put forward as a solution where user feedback is simulated based on logged historical user data, thus enabling optimization and evaluation without being run online. While simulators do not risk the user experience and are widely accessible, we identify an important limitation of existing simulation methods. They ignore the interaction biases present in logged user data, and consequently, these biases affect the resulting simulation. As a solution to this issue, we introduce a debiasing step in the simulation pipeline, which corrects for the biases present in the logged data before it is used to simulate user behavior. To evaluate the effects of bias on RL4Rec simulations, we propose a novel evaluation approach for simulators that considers the performance of policies optimized with the simulator. Our results reveal that the biases from logged data negatively impact the resulting policies, unless corrected for with our debiasing method. While our debiasing methods can be applied to any simulator, we make our complete pipeline publicly available as the Simulator for OFfline leArning and evaluation (SOFA): the first simulator that accounts for interaction biases prior to optimization and evaluation.

Pages: 190-199

Number of pages: 10
