Differentially Private Reinforcement Learning with Linear Function Approximation

Cited by: 4

Authors:

Zhou, Xingyu [1]

Affiliation:

[1] Wayne State Univ, 5050 Anthony Wayne Dr, Detroit, MI 48202 USA

Source:

PROCEEDINGS OF THE ACM ON MEASUREMENT AND ANALYSIS OF COMPUTING SYSTEMS | 2022, Vol. 6, Issue 01

Keywords:

reinforcement learning; differential privacy; linear function approximations; ALGORITHMS; DESIGN;

DOI:

10.1145/3508028

Chinese Library Classification (CLC) number:

TP3 [Computing technology, computer technology];

Discipline classification code:

0812;

Abstract:

Motivated by the wide adoption of reinforcement learning (RL) in real-world personalized services, where users' sensitive and private information needs to be protected, we study regret minimization in finite-horizon Markov decision processes (MDPs) under the constraints of differential privacy (DP). Compared to existing private RL algorithms that work only on tabular finite-state, finite-action MDPs, we take the first step towards privacy-preserving learning in MDPs with large state and action spaces. Specifically, we consider MDPs with linear function approximation (in particular linear mixture MDPs) under the notion of joint differential privacy (JDP), where the RL agent is responsible for protecting users' sensitive data. We design two private RL algorithms that are based on value iteration and policy optimization, respectively, and show that they enjoy sub-linear regret performance while guaranteeing privacy protection. Moreover, the regret bounds are independent of the number of states and scale at most logarithmically with the number of actions, making the algorithms suitable for privacy protection in today's large-scale personalized services. Our results are achieved via a general procedure for learning in linear mixture MDPs under changing regularizers, which not only generalizes previous results for non-private learning, but also serves as a building block for general private reinforcement learning.


Pages: 27

50 results in total

[1] Differentially Private Reinforcement Learning
Ma, Pingchuan
Wang, Zhiqiang
Zhang, Le
Wang, Ruming
Zou, Xiaoxiang
Yang, Tao
INFORMATION AND COMMUNICATIONS SECURITY (ICICS 2019), 2020, 11999 : 668 - 683
[2] Provably Efficient Reinforcement Learning with Linear Function Approximation
Jin, Chi
Yang, Zhuoran
Wang, Zhaoran
Jordan, Michael I.
MATHEMATICS OF OPERATIONS RESEARCH, 2023, 48 (03) : 1496 - 1521
[3] Gaussian Based Non-linear Function Approximation for Reinforcement Learning
Haider A.
Hawe G.
Wang H.
Scotney B.
SN Computer Science, 2021, 2 (3)
[4] Reinforcement Learning-Based Personalized Differentially Private Federated Learning
Lu, Xiaozhen
Liu, Zihan
Xiao, Liang
Dai, Huaiyu
IEEE TRANSACTIONS ON INFORMATION FORENSICS AND SECURITY, 2025, 20 : 465 - 477
[5] Evolutionary function approximation for reinforcement learning
Whiteson, Shimon
Stone, Peter
JOURNAL OF MACHINE LEARNING RESEARCH, 2006, 7 : 877 - 917
[6] Reinforcement-Learning-Based Query Optimization in Differentially Private IoT Data Publishing
Jiang, Yili
Zhang, Kuan
Qian, Yi
Zhou, Liang
IEEE INTERNET OF THINGS JOURNAL, 2021, 8 (14) : 11163 - 11176
[7] Sigmoid-weighted linear units for neural network function approximation in reinforcement learning
Elfwing, Stefan
Uchibe, Eiji
Doya, Kenji
NEURAL NETWORKS, 2018, 107 : 3 - 11
[8] Multiagent reinforcement learning using function approximation
Abul, O
Polat, F
Alhajj, R
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS, 2000, 30 (04) : 485 - 497
[9] An online prediction algorithm for reinforcement learning with linear function approximation using cross entropy method
Joseph, Ajin George
Bhatnagar, Shalabh
MACHINE LEARNING, 2018, 107 (8-10) : 1385 - 1429
