A Reinforcement Learning-Based Control Approach for Unknown Nonlinear Systems with Persistent Adversarial Inputs

Cited by: 1

Authors:

Zhong, Xiangnan ^{[1]}

He, Haibo ^{[2]}

Affiliations:

[1] Florida Atlantic Univ, Dept Comp & Elect Engn & Comp Sci, Boca Raton, FL 33431 USA

[2] Univ Rhode Isl, Dept Elect Comp & Biomed Engn, Kingston, RI 02881 USA

Source:

2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2021

Funding:

U.S. National Science Foundation;

Keywords:

Reinforcement learning; zero-sum games; neural networks; observer; online learning and control; TRACKING CONTROL; GAME; ADP; GO;

DOI:

10.1109/IJCNN52387.2021.9534429

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

This paper develops an intelligent control method based on reinforcement learning techniques for unknown nonlinear continuous-time systems in an adversarial environment. The developed method can automatically learn the optimal control input for the system and also predict the worst-case adversarial input that an adversary can introduce. In addition, we assume that the agent can observe only partial information about the environment during the learning process. Therefore, a neural network-based observer is developed to adaptively reconstruct the hidden states and dynamics. Theoretical analysis is then provided to show the stability of the developed intelligent control and the accuracy of the established observer. The method has been applied to a torsional pendulum system, and the results demonstrate the effectiveness of the designed approach.


Pages: 8

References: 56 in total

[1] Neurodynamic programming and zero-sum games for constrained control systems [J].

Abu-Khalaf, Murad ;

Lewis, Frank L. ;

Huang, Jie .

IEEE TRANSACTIONS ON NEURAL NETWORKS, 2008, 19 (07) :1243-1252

[2]

[Anonymous], 2010, NEURAL NETWORK BASED

[3]

Barto AG, 1998, REINFORCEMENT LEARNI

[4]

Basar T., 2008, H-Optimal Control and Related Minimax Design Problems: A Dynamic Game Approach

[5]

Brooks RA., 1991, P 12 INT JOINT C ART, P569, DOI 10.1007/BF01538672

[6]

Christiano PF, 2017, ADV NEUR IN, V30

[7] Optimal Control of Affine Nonlinear Continuous-time Systems Using an Online Hamilton-Jacobi-Isaacs Formulation [J].

Dierks, T. ;

Jagannathan, S. .

49TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2010, :3048-3053

[8] Robust Adaptive Dynamic Programming of Two-Player Zero-Sum Games for Continuous-Time Linear Systems [J].

Fu, Yue ;

Fu, Jun ;

Chai, Tianyou .

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2015, 26 (12) :3314-3319

[9]

Jaakkola T., 1995, Advances in Neural Information Processing Systems 7, P345

[10] Robust Adaptive Dynamic Programming and Feedback Stabilization of Nonlinear Systems [J].

Jiang, Yu ;

Jiang, Zhong-Ping .

IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2014, 25 (05) :882-893
