Deep Reinforcement Learning Based Optimal Defense for Cyber-Physical System in presence of Unknown Cyber-attack

Citations: 0
Authors
Feng, Ming [1]
Xu, Hao [1]
Affiliations
[1] Univ Nevada, Dept Elect & Biomed Engn, Reno, NV 89557 USA
Source
2017 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI) | 2017
Keywords
Cyber-physical systems; cyber state dynamics; cyber-attack; deep reinforcement learning; game theory; NONLINEAR-SYSTEMS; THEORETIC METHODS; GAME; SECURITY;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this paper, the online optimal cyber-defense problem is investigated for Cyber-Physical Systems (CPS) subject to unknown cyber-attacks. First, a novel cyber-state dynamics model is derived that can evaluate, effectively and dynamically, the real-time impact of the current cyber-attack and defense strategies. Next, using game theory, the ideal optimal defense design is obtained under full knowledge of the cyber-state dynamics. To relax this requirement on knowledge of the cyber-state dynamics, a game-theoretic actor-critic neural network (NN) structure is developed to efficiently learn the optimal cyber-defense strategy online. Moreover, to further improve the practicality of the developed scheme, a novel deep reinforcement learning algorithm is designed and implemented in the actor-critic NN structure. Finally, numerical simulations demonstrate that the proposed deep reinforcement learning based optimal defense strategy can not only defend the CPS online, even in the presence of unknown cyber-attacks, but also learn the optimal defense policy more accurately and in a more timely manner.
Pages: 1642-1649
Page count: 8