Locally Differentially Private Reinforcement Learning for Linear Mixture Markov Decision Processes

Cited by: 0
Authors
Liao, Chonghua [1]
He, Jiafan [2]
Gu, Quanquan [2]
Affiliations
[1] Tsinghua Univ, Dept Automat, Beijing, Peoples R China
[2] Univ Calif Los Angeles, Dept Comp Sci, Los Angeles, CA 90095 USA
Source
ASIAN CONFERENCE ON MACHINE LEARNING, VOL 189 | 2022 / Vol. 189
Funding
U.S. National Science Foundation;
Keywords
Machine learning; Reinforcement learning; Differential privacy; Linear mixture MDPs;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Reinforcement learning (RL) algorithms can be used to provide personalized services, which rely on users' private and sensitive data. To protect users' privacy, privacy-preserving RL algorithms are in demand. In this paper, we study RL with linear function approximation and local differential privacy (LDP) guarantees. We propose a novel $(\varepsilon, \delta)$-LDP algorithm for learning a class of Markov decision processes (MDPs) dubbed linear mixture MDPs, which achieves an $\tilde{O}\big(d^{5/4} H^{7/4} T^{3/4} (\log(1/\delta))^{1/4} \sqrt{1/\varepsilon}\big)$ regret, where $d$ is the dimension of the feature mapping, $H$ is the length of the episodes, and $T$ is the number of interactions with the environment. We also prove a lower bound of $\Omega\big(dH\sqrt{T} / (e^{\varepsilon}(e^{\varepsilon} - 1))\big)$ for learning linear mixture MDPs under the $\varepsilon$-LDP constraint. Experiments on synthetic datasets verify the effectiveness of our algorithm. To the best of our knowledge, this is the first provable privacy-preserving RL algorithm with linear function approximation.
Pages: 16
Related papers
44 in total
[1]   Deep Learning with Differential Privacy [J].
Abadi, Martin ;
Chu, Andy ;
Goodfellow, Ian ;
McMahan, H. Brendan ;
Mironov, Ilya ;
Talwar, Kunal ;
Zhang, Li .
CCS'16: PROCEEDINGS OF THE 2016 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, 2016, :308-318
[2]  
Abbasi-Yadkori Yasin, 2011, P ADV NEUR INF PROC, V11, P2312
[3]  
Ayoub A., 2020, INT C MACHINE LEARNI, P463
[4]  
Azar MG, 2017, PR MACH LEARN RES, V70
[5]  
Balle B, 2016, PR MACH LEARN RES, V48
[6]  
Basu D, 2020, Arxiv, DOI arXiv:1905.12298
[7]   ESTIMATION OF DENSITIES - MINIMAL RISK [J].
BRETAGNOLLE, J ;
HUBER, C .
ZEITSCHRIFT FUR WAHRSCHEINLICHKEITSTHEORIE UND VERWANDTE GEBIETE, 1979, 47 (02) :119-137
[8]  
Cai Q., 2020, INT C MACHINE LEARNI, P1283
[9]  
Chen X., 2020, INT C MACHINE LEARNI, P1757
[10]  
Cover T. M., Thomas J. A., 2006, Elements of Information Theory, 2nd ed.