Differentially Private Reinforcement Learning with Linear Function Approximation

Cited by: 4
Author
Zhou, Xingyu [1]
Affiliation
[1] Wayne State Univ, 5050 Anthony Wayne Dr, Detroit, MI 48202 USA
Keywords
reinforcement learning; differential privacy; linear function approximations; ALGORITHMS; DESIGN;
DOI
10.1145/3508028
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology];
Discipline classification code
0812;
Abstract
Motivated by the wide adoption of reinforcement learning (RL) in real-world personalized services, where users' sensitive and private information needs to be protected, we study regret minimization in finite-horizon Markov decision processes (MDPs) under the constraints of differential privacy (DP). Compared to existing private RL algorithms that work only on tabular finite-state, finite-action MDPs, we take the first step towards privacy-preserving learning in MDPs with large state and action spaces. Specifically, we consider MDPs with linear function approximation (in particular linear mixture MDPs) under the notion of joint differential privacy (JDP), where the RL agent is responsible for protecting users' sensitive data. We design two private RL algorithms that are based on value iteration and policy optimization, respectively, and show that they enjoy sub-linear regret performance while guaranteeing privacy protection. Moreover, the regret bounds are independent of the number of states and scale at most logarithmically with the number of actions, making the algorithms suitable for privacy protection in today's large-scale personalized services. Our results are achieved via a general procedure for learning in linear mixture MDPs under changing regularizers, which not only generalizes previous results for non-private learning, but also serves as a building block for general private reinforcement learning.
Pages: 27
Related papers
50 in total
  • [41] DIFFERENTIALLY PRIVATE LEARNING OF GEOMETRIC CONCEPTS
    Kaplan H.
    Mansour Y.
    Matias Y.
    Stemmer U.
    [J]. SIAM Journal on Computing, 2022, 51 (04) : 952 - 974
  • [42] Differentially Private Extreme Learning Machine
    Ono, Hajime
    Tran Thi Phuong
    Le Trieu Phong
    [J]. MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE, MDAI 2024, 2024, 14986 : 165 - 176
  • [43] Differentially private ensemble learning for classification
    Li, Xianxian
    Liu, Jing
    Liu, Songfeng
    Wang, Jinyan
    [J]. NEUROCOMPUTING, 2021, 430 : 34 - 46
  • [44] Differentially private distributed estimation and learning
    Papachristou, Marios
    Rahimian, M. Amin
    [J]. IISE TRANSACTIONS, 2024, : 756 - 772
  • [45] DIFFERENTIALLY PRIVATE LEARNING OF GEOMETRIC CONCEPTS
    Kaplan, Haim
    Mansour, Yishay
    Matias, Yossi
    Stemmer, Uri
    [J]. SIAM JOURNAL ON COMPUTING, 2022, 51 (04) : 952 - 974
  • [46] Old Techniques in Differentially Private Linear Regression
    Sheffet, Or
    [J]. ALGORITHMIC LEARNING THEORY, VOL 98, 2019, 98
  • [47] Unconditional Differentially Private Mechanisms for Linear Queries
    Bhaskara, Aditya
    Dadush, Daniel
    Krishnaswamy, Ravishankar
    Talwar, Kunal
    [J]. STOC'12: PROCEEDINGS OF THE 2012 ACM SYMPOSIUM ON THEORY OF COMPUTING, 2012, : 1269 - 1283
  • [48] Differentially Private Hypothesis Testing for Linear Regression
    Alabi, Daniel G.
    Vadhan, Salil P.
    [J]. JOURNAL OF MACHINE LEARNING RESEARCH, 2023, 24
  • [49] On the convergence of temporal-difference learning with linear function approximation
    Tadic, V
    [J]. MACHINE LEARNING, 2001, 42 (03) : 241 - 267
  • [50] On the Convergence of Temporal-Difference Learning with Linear Function Approximation
    Vladislav Tadić
    [J]. Machine Learning, 2001, 42 : 241 - 267