Output regulation of unknown linear systems using average cost reinforcement learning

Cited by: 23

Authors:

Yaghmaie, Farnaz Adib ^{[1]}

Gunnarsson, Svante ^{[1]}

Lewis, Frank L. ^{[2,3]}

Affiliations:

[1] Linkoping Univ, Dept Elect Engn, Linkoping, Sweden

[2] Univ Texas Arlington, Res Inst, Arlington, TX 76019 USA

[3] Northeastern Univ, Shenyang, Liaoning, Peoples R China

Source:

AUTOMATICA | 2019 / Vol. 110

Keywords:

Output regulation; Reinforcement learning; Linear systems; Optimal control; OPTIMAL TRACKING CONTROL; CONTINUOUS-TIME SYSTEMS;

DOI:

10.1016/j.automatica.2019.108549

Chinese Library Classification:

TP [Automation technology, computer technology];

Subject classification code:

0812;

Abstract:

In this paper, we introduce an optimal average cost learning framework to solve the output regulation problem for linear systems with unknown dynamics. The framework designs a controller that achieves output tracking and disturbance rejection while minimizing the average cost. We derive the Hamilton-Jacobi-Bellman (HJB) equation for the optimal average cost problem and develop a reinforcement learning algorithm to solve it. The proposed algorithm is an off-policy routine that learns the optimal average cost solution completely model-free. We rigorously analyze the convergence of the proposed algorithm. Compared to previous approaches for optimal tracking controller design, we alleviate the need for judicious selection of the discounting factor, and the proposed algorithm can be implemented completely model-free. We support our theoretical results with a simulation example. (C) 2019 Elsevier Ltd. All rights reserved.


Pages: 7

References: 23 in total

[1] [Anonymous], 1985, Matrix Analysis
[2] [Anonymous], 2013, NONLINEAR CONTROL SY
[3] [Anonymous], 1998, REINFORCEMENT LEARNI
[4] Beard, RW; Saridis, GN; Wen, JT. Approximate solutions to the time-invariant Hamilton-Jacobi-Bellman equation [J]. JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 1998, 96 (03): 589-626
[5] Bellman R., 1958, Dynamic Programming
[6] Bertsekas D. P., 2005, Dynamic Programming and Optimal Control, V1
[7] Bertsekas DP, 1995, PROCEEDINGS OF THE 34TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-4, P560, DOI 10.1109/CDC.1995.478953
[8] Gao, Weinan; Jiang, Zhong-Ping. Adaptive Dynamic Programming and Adaptive Optimal Output Regulation of Linear Systems [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2016, 61 (12): 4164-4169
[9] Huang, Yuzhu; Liu, Derong. Neural-network-based optimal tracking control scheme for a class of unknown discrete-time nonlinear systems using iterative ADP algorithm [J]. NEUROCOMPUTING, 2014, 125: 46-56
[10] Kamalapurkar, Rushikesh; Dinh, Huyen; Bhasin, Shubhendu; Dixon, Warren E. Approximate optimal trajectory tracking for continuous-time nonlinear systems [J]. AUTOMATICA, 2015, 51: 40-48
