Output regulation of unknown linear systems using average cost reinforcement learning

Cited by: 23
Authors
Yaghmaie, Farnaz Adib [1]
Gunnarsson, Svante [1]
Lewis, Frank L. [2, 3]
Affiliations
[1] Linkoping Univ, Dept Elect Engn, Linkoping, Sweden
[2] Univ Texas Arlington, Res Inst, Arlington, TX 76019 USA
[3] Northeastern Univ, Shenyang, Liaoning, Peoples R China
Keywords
Output regulation; Reinforcement learning; Linear systems; Optimal control; OPTIMAL TRACKING CONTROL; CONTINUOUS-TIME SYSTEMS;
DOI
10.1016/j.automatica.2019.108549
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
In this paper, we introduce an optimal average cost learning framework to solve the output regulation problem for linear systems with unknown dynamics. The framework designs a controller that achieves output tracking and disturbance rejection while minimizing the average cost. We derive the Hamilton-Jacobi-Bellman (HJB) equation for the optimal average cost problem and develop a reinforcement learning algorithm to solve it. The proposed algorithm is an off-policy routine that learns the optimal average cost solution completely model-free. We rigorously analyze the convergence of the proposed algorithm. Compared to previous approaches for optimal tracking controller design, we alleviate the need for judicious selection of the discounting factor, and the proposed algorithm can be implemented completely model-free. We support our theoretical results with a simulation example. © 2019 Elsevier Ltd. All rights reserved.
Pages: 7
Related papers
23 in total
  • [1] [Anonymous], 1985, Matrix Analysis
  • [2] [Anonymous], 2013, NONLINEAR CONTROL SY
  • [3] [Anonymous], 1998, REINFORCEMENT LEARNI
  • [4] Approximate solutions to the time-invariant Hamilton-Jacobi-Bellman equation
    Beard, RW
    Saridis, GN
    Wen, JT
    [J]. JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 1998, 96 (03) : 589 - 626
  • [5] Bellman R., 1958, Dynamic Programming
  • [6] Bertsekas D. P., 2005, Dynamic Programming and Optimal Control, V1
  • [7] Bertsekas DP, 1995, PROCEEDINGS OF THE 34TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-4, P560, DOI 10.1109/CDC.1995.478953
  • [8] Adaptive Dynamic Programming and Adaptive Optimal Output Regulation of Linear Systems
    Gao, Weinan
    Jiang, Zhong-Ping
    [J]. IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2016, 61 (12) : 4164 - 4169
  • [9] Neural-network-based optimal tracking control scheme for a class of unknown discrete-time nonlinear systems using iterative ADP algorithm
    Huang, Yuzhu
    Liu, Derong
    [J]. NEUROCOMPUTING, 2014, 125 : 46 - 56
  • [10] Approximate optimal trajectory tracking for continuous-time nonlinear systems
    Kamalapurkar, Rushikesh
    Dinh, Huyen
    Bhasin, Shubhendu
    Dixon, Warren E.
    [J]. AUTOMATICA, 2015, 51 : 40 - 48