Keeping Dataset Biases out of the Simulation: A Debiased Simulator for Reinforcement Learning based Recommender Systems

Cited by: 45
Authors
Huang, Jin [1]
Oosterhuis, Harrie [1]
de Rijke, Maarten [1, 2]
van Hoof, Herke [1]
Affiliations
[1] Univ Amsterdam, Amsterdam, Netherlands
[2] Ahold Delhaize, Amsterdam, Netherlands
Source
RECSYS 2020: 14TH ACM CONFERENCE ON RECOMMENDER SYSTEMS | 2020
Keywords
Reinforcement learning; Recommender systems; Simulation; Interaction bias; Environment
DOI
10.1145/3383313.3412252
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Reinforcement learning for recommendation (RL4Rec) methods are increasingly receiving attention as an effective way to improve long-term user engagement. However, applying RL4Rec online comes with risks: exploration may lead to periods of detrimental user experience. Moreover, few researchers have access to real-world recommender systems. Simulations have been put forward as a solution where user feedback is simulated based on logged historical user data, thus enabling optimization and evaluation without being run online. While simulators do not risk the user experience and are widely accessible, we identify an important limitation of existing simulation methods. They ignore the interaction biases present in logged user data, and consequently, these biases affect the resulting simulation. As a solution to this issue, we introduce a debiasing step in the simulation pipeline, which corrects for the biases present in the logged data before it is used to simulate user behavior. To evaluate the effects of bias on RL4Rec simulations, we propose a novel evaluation approach for simulators that considers the performance of policies optimized with the simulator. Our results reveal that the biases from logged data negatively impact the resulting policies, unless corrected for with our debiasing method. While our debiasing methods can be applied to any simulator, we make our complete pipeline publicly available as the Simulator for OFfline leArning and evaluation (SOFA): the first simulator that accounts for interaction biases prior to optimization and evaluation.
Pages: 190-199
Number of pages: 10