Adaptive optimal output tracking of continuous-time systems via output-feedback-based reinforcement learning

Cited by: 26
Authors
Chen, Ci [1,2]
Xie, Lihua [3]
Xie, Kan [1,4]
Lewis, Frank L. [5]
Xie, Shengli [6,7]
Affiliations
[1] Guangdong Univ Technol, Sch Automat, Guangzhou, Peoples R China
[2] Guangdong Key Lab IoT Informat Technol, Guangzhou, Peoples R China
[3] Nanyang Technol Univ, Sch Elect & Elect Engn, Singapore, Singapore
[4] 111 Ctr Intelligent Batch Mfg Based IoT Technol, Guangzhou, Peoples R China
[5] Univ Texas Arlington, UTA Res Inst, Ft Worth, TX USA
[6] Minist Educ, Key Lab Intelligent Informat Proc & Syst Integrat, Guangzhou, Peoples R China
[7] Guangdong HongKong Macao Joint Lab Smart Discrete, Guangzhou, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Reinforcement learning; Off-policy; Output tracking; Output feedback; Adaptive optimal control; Linear systems
DOI
10.1016/j.automatica.2022.110581
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline Classification Code
0812
Abstract
Reinforcement learning provides a powerful tool for designing a satisfactory controller through interactions with the environment. Although off-policy learning algorithms have recently been designed for tracking problems, most of these results either rely on full-state feedback or guarantee only bounded control errors, which may be neither flexible nor desirable for real-world engineering problems. To address these issues, we propose an output-feedback-based reinforcement learning approach that finds the optimal control solution using input-output data and ensures asymptotic tracking control of continuous-time systems. More specifically, we first propose a dynamical controller, revised from standard output regulation theory, and use it to formulate an optimal output tracking problem. Then, a state observer is used to re-express the system state. Consequently, we address the rank issue of the parameterization matrix and analyze the state re-expression error, both of which are crucial for transforming the off-policy learning into an output-feedback form. A comprehensive simulation study demonstrates the effectiveness of the proposed approach.
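For context, the optimal output tracking formulation described in the abstract builds on standard linear output regulation theory. The following is a minimal sketch of that standard setup, with the notation (A, B, C, S, F, X, U, K) assumed here for illustration rather than taken from the paper:

\dot{x} = A x + B u, \qquad y = C x \qquad \text{(plant)}
\dot{w} = S w \qquad \text{(exosystem generating the reference)}
e = C x - F w \qquad \text{(tracking error to be driven to zero)}

The regulator (Francis) equations, X S = A X + B U and 0 = C X - F, are solved for (X, U), and a tracking controller then takes the form u = K(x - X w) + U w, where K is any stabilizing feedback gain. Roughly, the paper learns an optimal gain of this kind from input-output data alone, replacing full-state measurement with a state observer and off-policy reinforcement learning.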
Pages: 14