Privacy-Preserving Deep Reinforcement Learning based on Differential Privacy

Times Cited: 0
Authors
Zhao, Wenxu [1 ]
Sang, Yingpeng [1 ]
Xiong, Neal [2 ]
Tian, Hui [3 ]
Affiliations
[1] Sun Yat Sen Univ, Guangzhou, Peoples R China
[2] Sul Ross State Univ, Alpine, TX USA
[3] Griffith Univ, Nathan, Qld, Australia
Source
2024 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS, IJCNN 2024 | 2024
Keywords
Deep Reinforcement Learning; Differential Privacy; Deep Q-Network; REINFORCE; Algorithms
DOI
10.1109/IJCNN60899.2024.10650755
CLC Number
TP18 [Artificial Intelligence Theory]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Deep reinforcement learning, with its extensive applications and remarkable performance, is emerging as a pivotal technology attracting researchers' attention. During training, agents interact with the environment frequently and exchange data with it, and this interaction information is closely tied to the training environment. Consequently, the training process carries a high risk of environmental privacy leakage: malicious third parties may steal the state transition matrix or other environmental information about the agent's application domain, compromising user privacy. To address this issue, we propose novel differentially private value-based and policy-based deep reinforcement learning algorithms. Our methods have the advantage of being adaptable to various environmental privacy concerns. We evaluate them in a customized experimental environment, conducting comparative experiments between the original and differentially private versions of the algorithms. The results indicate that our approach can provide differential privacy protection for environmental information with minimal impact on algorithm performance, ultimately achieving a good balance between privacy and utility.
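The abstract does not spell out the mechanism, but a common way to make a value-based method differentially private is to perturb the reward signal with Laplace noise calibrated to its sensitivity before the Q-update. The sketch below is illustrative only, under that assumption; the function names, the tabular setting, and all parameter values (`sensitivity`, `epsilon`, etc.) are ours, not taken from the paper.

```python
import numpy as np

def laplace_noise(sensitivity, epsilon, rng):
    # Laplace mechanism: scale b = sensitivity / epsilon yields
    # epsilon-differential privacy for a query with that L1 sensitivity.
    return rng.laplace(loc=0.0, scale=sensitivity / epsilon)

def dp_q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99,
                sensitivity=1.0, epsilon=0.5, rng=None):
    """One tabular Q-learning step with Laplace noise on the reward.

    Perturbing the reward limits what the learned Q-table reveals about
    the environment's reward structure. Illustrative sketch, not the
    paper's actual algorithm.
    """
    rng = rng or np.random.default_rng()
    noisy_r = r + laplace_noise(sensitivity, epsilon, rng)
    td_target = noisy_r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q
```

A smaller `epsilon` injects more noise (stronger privacy, lower utility); the privacy-utility balance the abstract reports corresponds to choosing `epsilon` so that performance degrades only minimally.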
Pages: 8