Data-Driven Nearly Optimal Control for Constrained Nonlinear Systems

Cited by: 0

Authors:

Yang, Xiong [1]

Affiliations:

[1] Tianjin University, School of Electrical and Information Engineering, Tianjin 300072, People's Republic of China

Source:

PROCEEDINGS OF 2020 IEEE 9TH DATA DRIVEN CONTROL AND LEARNING SYSTEMS CONFERENCE (DDCLS'20) | 2020

Funding:

National Natural Science Foundation of China

Keywords:

Asymmetric constraint; Data-driven control; Optimal control policy; Reinforcement learning; DESIGN;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology]

Discipline classification code:

0812

Abstract:

This article develops a novel data-driven policy iteration (PI) algorithm to obtain nearly optimal control of nonlinear systems with asymmetric input constraints. The data-driven PI is derived from a previously established model-based PI. Because the data-driven PI shares the same solution as the model-based PI, the convergence of the data-driven PI algorithm is guaranteed. The implementation of the data-driven PI algorithm relies on an actor-critic structure consisting of two kinds of neural networks (NNs): the critic NN estimates the value function, and the actor NNs approximate the control policies. The weights of the critic and actor NNs are determined via the least squares method together with the Monte Carlo integration technique. Finally, a nonlinear plant is used to validate the proposed data-driven PI algorithm.
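The weight-fitting step mentioned in the abstract (least squares combined with Monte Carlo integration) can be illustrated with a minimal sketch. This is not the paper's algorithm: the scalar state, the domain [-1, 1], the basis functions `basis`, and the toy target `target_value` are all assumptions made for illustration. The integrals that define the least-squares normal equations are replaced by sums over uniformly sampled states.

```python
import random


def basis(x):
    """Hypothetical critic basis: even polynomial features of a scalar state."""
    return (x * x, x ** 4)


def target_value(x):
    """Toy value function (an assumption), chosen to lie in the span of the basis."""
    return 2.0 * x * x + 0.5 * x ** 4


def fit_critic_weights(num_samples=2000, seed=0):
    """Least-squares fit of w in V(x) ~ w . basis(x) over the domain [-1, 1].

    The integrals of basis * basis^T and basis * target over the domain are
    approximated by Monte Carlo sums over uniformly drawn states. The common
    1/num_samples factor cancels on both sides of the normal equations.
    """
    rng = random.Random(seed)
    a11 = a12 = a22 = b1 = b2 = 0.0
    for _ in range(num_samples):
        x = rng.uniform(-1.0, 1.0)
        p1, p2 = basis(x)
        y = target_value(x)
        a11 += p1 * p1
        a12 += p1 * p2
        a22 += p2 * p2
        b1 += p1 * y
        b2 += p2 * y
    # Solve the 2x2 normal equations A w = b by Cramer's rule.
    det = a11 * a22 - a12 * a12
    w1 = (b1 * a22 - a12 * b2) / det
    w2 = (a11 * b2 - a12 * b1) / det
    return w1, w2


weights = fit_critic_weights()
print(weights)
```

Because the toy target lies exactly in the span of the assumed basis, the fitted weights should match the coefficients 2.0 and 0.5 up to floating-point error. With a target outside the span, the same procedure would return the best approximation in the sampled least-squares sense.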


Pages: 105-110

Page count: 6

