Intrinsically motivated reinforcement learning based recommendation with counterfactual data augmentation

Cited by: 3
Authors
Chen, Xiaocong [1 ]
Wang, Siyu [1 ]
Qi, Lianyong [2 ]
Li, Yong [3 ]
Yao, Lina [1 ,4 ]
Affiliations
[1] Univ New South Wales, Sch Comp Sci & Engn, Sydney, NSW 2052, Australia
[2] China Univ Petr East China, Coll Comp Sci & Technol, Dong Ying Shi, Peoples R China
[3] Tsinghua Univ, Dept Elect Engn, Beijing 100084, Peoples R China
[4] CSIRO, Data 61, Eveleigh, NSW 2015, Australia
Source
WORLD WIDE WEB-INTERNET AND WEB INFORMATION SYSTEMS | 2023, Vol. 26, Issue 5
Keywords
Recommender systems; Deep reinforcement learning; Counterfactual reasoning; CAPACITY;
DOI
10.1007/s11280-023-01187-7
CLC Classification
TP [Automation Technology, Computer Technology];
Discipline Code
0812
Abstract
Deep reinforcement learning (DRL) has shown promising results in modeling dynamic user preferences in recommender systems (RS) in recent literature. However, training a DRL agent in the sparse RS environment poses a significant challenge: the agent must balance exploring informative user-item interaction trajectories against exploiting existing trajectories for policy learning, the well-known exploration-exploitation trade-off. This trade-off strongly affects recommendation performance when the environment is sparse. In DRL-based RS, balancing exploration and exploitation is even more challenging, as the agent needs to explore informative trajectories deeply and exploit them efficiently in the RS context. To address this issue, we propose a novel intrinsically motivated reinforcement learning (IMRL) method that enhances the agent's capability to explore informative interaction trajectories in the sparse environment. We further enrich these trajectories via an adaptive counterfactual augmentation strategy with a customised threshold to improve their efficiency in exploitation. Our approach is evaluated on six offline datasets and three online simulation platforms, demonstrating its superiority over existing state-of-the-art methods. Extensive experiments show that IMRL outperforms other methods in recommendation performance in the sparse RS environment.
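The core idea of intrinsic motivation in the abstract is to reward the agent for visiting under-explored interaction trajectories, on top of the reward the environment provides. The sketch below is only an illustration of that general idea, not the paper's actual IMRL method: it uses a simple count-based novelty bonus, and all class and variable names here are hypothetical.

```python
from collections import defaultdict
import math


class CountBasedIntrinsicReward:
    """Toy count-based novelty bonus: rarely visited states earn a larger
    intrinsic reward, nudging the agent toward unexplored trajectories."""

    def __init__(self, beta: float = 0.1):
        self.beta = beta                     # scale of the intrinsic bonus
        self.visit_counts = defaultdict(int)  # state -> number of visits

    def bonus(self, state_key: str) -> float:
        """Record a visit and return a bonus that decays as 1/sqrt(n)."""
        self.visit_counts[state_key] += 1
        return self.beta / math.sqrt(self.visit_counts[state_key])


shaper = CountBasedIntrinsicReward(beta=0.1)
extrinsic = 1.0                              # reward from the RS environment
# The agent trains on the shaped reward: extrinsic + intrinsic bonus.
total = extrinsic + shaper.bonus("user42-item7")  # first visit -> bonus = 0.1
```

The bonus shrinks as a state is revisited, so exploration pressure fades once a trajectory has been exploited; the actual paper pairs its intrinsic reward with counterfactual trajectory augmentation, which this minimal sketch does not attempt to reproduce.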
Pages: 3253-3274 (22 pages)