FedSlate: A Federated Deep Reinforcement Learning Recommender System

Cited by: 0
Authors
Deng, Yongxin [1]
Qiu, Xihe [1]
Tan, Xiaoyu [2]
Jin, Yaochu [3]
Affiliations
[1] Shanghai Univ Engn Sci, Sch Elect & Elect Engn, Shanghai 201620, Peoples R China
[2] INFLY TECH Shanghai Co Ltd, Shanghai 200232, Peoples R China
[3] Westlake Univ, Sch Engn, Hangzhou 310030, Peoples R China
Source
IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE | 2025
Keywords
Recommender systems; Training; Federated learning; Servers; Adaptation models; Deep reinforcement learning; Privacy; Law; Distributed databases; Data models; Recommender system; reinforcement learning; federated learning; vertical federated learning; privacy preservation
DOI
10.1109/TETCI.2025.3573250
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Reinforcement learning methods have been used to optimize long-term user engagement in recommendation systems. However, existing reinforcement learning-based recommendation systems do not fully exploit the relevance of an individual user's behavior across different platforms. One potential solution is to aggregate data from the various platforms in a central location and train on the aggregated data. However, this approach raises economic and legal concerns, including increased communication costs and potential threats to user privacy. To address these challenges, we propose FedSlate, a federated reinforcement learning recommendation algorithm that effectively utilizes information that is legally prohibited from being shared. FedSlate builds on the SlateQ algorithm to learn users' long-term behavior and to evaluate the value of recommended content. We extend the application scope of recommendation systems from single-user single-platform to single-user multi-platform and address cross-platform learning challenges by introducing federated learning. We use RecSim to construct a simulation environment for evaluating FedSlate and compare its performance with state-of-the-art benchmark recommendation models. Experimental results demonstrate that FedSlate outperforms the baseline methods in various environmental settings and that it can learn recommendation strategies in scenarios where the baseline methods are completely inapplicable.
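As a rough illustration of the SlateQ idea the abstract mentions, the sketch below computes the value of a slate as a choice-probability-weighted sum of per-item long-term values. It is a minimal sketch under stated assumptions, not the FedSlate implementation: the function name `slate_q_value`, the score-proportional choice model, and treating the no-click option as contributing zero value are all simplifications introduced here for illustration.

```python
from typing import Sequence


def slate_q_value(
    item_scores: Sequence[float],
    item_q_values: Sequence[float],
    no_click_score: float = 1.0,
) -> float:
    """Value of a slate under a SlateQ-style decomposition.

    Assumption (hypothetical, for illustration): the user picks item i
    with probability score_i / (no_click_score + sum of scores), and the
    no-click outcome contributes zero value.
    """
    if len(item_scores) != len(item_q_values):
        raise ValueError("item_scores and item_q_values must have equal length")
    if no_click_score < 0 or any(s < 0 for s in item_scores):
        raise ValueError("scores must be non-negative")

    total = no_click_score + sum(item_scores)
    if total == 0:
        return 0.0  # nothing can be chosen, so the slate has no value

    # Q(s, A) = sum over items i in slate A of P(i | s, A) * Q(s, i)
    return sum(s / total * q for s, q in zip(item_scores, item_q_values))


if __name__ == "__main__":
    # Two-item slate with hypothetical scores and per-item long-term values.
    print(slate_q_value([1.0, 3.0], [2.0, 4.0], no_click_score=1.0))
```

The decomposition lets an agent learn values per item rather than per slate, which keeps the action space tractable as the slate size grows.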
Pages: 15