A Digital Receding-Horizon Learning Controller for Nonlinear Continuous-time Systems

Cited by: 2

Authors:

Zhang, Xinglong [1]

Li, Wenzhang [1]

Xu, Xin [1]

Jiang, Wei [1]

Affiliations:

[1] National University of Defense Technology, College of Intelligence Science and Technology, Changsha 410073, People's Republic of China

Source:

IFAC PAPERSONLINE | 2020 / Vol. 53 / Issue 02

Funding:

National Key Research and Development Program of China; National Natural Science Foundation of China

Keywords:

Reinforcement learning; receding horizon strategy; sampled-data control; continuous-time; nonlinear system; model predictive control

DOI:

10.1016/j.ifacol.2020.12.2297

Chinese Library Classification:

TP [Automation and Computer Technology]

Discipline classification code:

0812

Abstract:

The integration of reinforcement learning (RL) and model predictive control (MPC) is promising for solving nonlinear optimization problems in an efficient manner. In this paper, a digital receding horizon learning controller is proposed for continuous-time nonlinear systems with control constraints. The main idea is to develop a digital design for RL with actor-critic design (ACD) in the framework of MPC, to realize near-optimal control of continuous-time nonlinear systems. Different from classic RL for continuous-time systems, the actor adopted is learned in discrete-time steps, while the critic evaluates the learned control policy continuously in the time domain. Moreover, we use soft barrier functions to deal with control constraints, and the robustness of the actor-critic network is proven. A simulation example is considered to show the effectiveness of the proposed approach. Copyright (C) 2020 The Authors.

Pages: 8136-8141

Page count: 6
