Deep Reinforcement Learning Based Optimal Defense for Cyber-Physical System in presence of Unknown Cyber-attack

Cited by: 0

Authors:

Feng, Ming [1]

Xu, Hao [1]

Affiliations:

[1] Univ Nevada, Dept Elect & Biomed Engn, Reno, NV 89557 USA

Source:

2017 IEEE SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE (SSCI) | 2017

Keywords:

Cyber-physical systems; cyber state dynamics; cyber-attack; deep reinforcement learning; game theory; NONLINEAR-SYSTEMS; THEORETIC METHODS; GAME; SECURITY;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

In this paper, the online optimal cyber-defense problem is investigated for Cyber-Physical Systems (CPS) under unknown cyber-attacks. First, a novel cyber-state dynamics model is developed that evaluates, in real time, the impact of the current cyber-attack and defense strategies. Next, using game theory, the ideal optimal defense design is obtained under full knowledge of the cyber-state dynamics. To relax the requirement for full knowledge of the cyber-state dynamics, a game-theoretic actor-critic neural network (NN) structure is developed to learn the optimal cyber-defense strategy online efficiently. Moreover, to further improve the practicality of the developed scheme, a novel deep reinforcement learning algorithm is designed and implemented in the actor-critic NN structure. Finally, numerical simulations demonstrate that the proposed deep reinforcement learning based optimal defense strategy can not only defend the CPS online, even in the presence of unknown cyber-attacks, but also learn the optimal defense policy more accurately and in a more timely manner.

Pages: 1642-1649

Page count: 8
