Linear quadratic optimal control method based on output feedback inverse reinforcement Q-learning

Cited by: 0
Authors
Liu, Wen [1]
Fan, Jia-Lu [1]
Xue, Wen-Qian [1]
Affiliations
[1] State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang, Liaoning
Source
Kongzhi Lilun Yu Yingyong/Control Theory and Applications | 2024 / Vol. 41 / No. 08
Funding
National Natural Science Foundation of China;
Keywords
data-driven optimal control; inverse reinforcement learning; output feedback; Q-learning;
DOI
10.7641/CTA.2023.20551
Abstract
This paper proposes a data-driven output feedback optimal control method based on inverse reinforcement Q-learning for the linear quadratic optimal control problem of linear discrete-time systems with unknown model parameters and unmeasurable states. Using only input and output data, the method adaptively determines appropriate quadratic performance index weights and the corresponding optimal control law, so that the controlled system reproduces the given reference trajectories. First, a parameter-correction equation is derived; combining it with inverse optimal control yields a model-based inverse reinforcement learning optimal control framework that computes corrections to the output feedback control law and the performance index weights. Building on this framework, the paper introduces the idea of reinforcement Q-learning and ultimately proposes a data-driven output feedback inverse reinforcement Q-learning optimal control method that requires no system model parameters, solving for the output feedback control law parameters and the performance index weights from historical input and output data alone. Theoretical analysis and simulation experiments verify the effectiveness of the proposed method. © 2024 South China University of Technology. All rights reserved.
Pages: 1469-1479
Page count: 10
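
Note: The abstract describes two components: a forward Q-learning step that solves the LQ problem from data, and an inverse step that corrects the performance-index weights from reference trajectories. The following Python code is a minimal sketch of the forward component only, using classic model-free Q-learning policy iteration for a state-feedback LQ regulator. It is not the authors' algorithm: the paper works with output feedback and also recovers the weights, while here the 2-state system, weights Qw and Rw, noise level, and sample counts are all illustrative assumptions, and the system matrices A and B are used only to simulate data, never inside the learning update.

import numpy as np

rng = np.random.default_rng(0)

# Assumed 2-state/1-input stable system, used only to generate transition data.
A = np.array([[0.9, 0.2],
              [0.0, 0.8]])
B = np.array([[0.0],
              [1.0]])
Qw, Rw = np.eye(2), np.eye(1)   # assumed quadratic performance-index weights
n, m = 2, 1
p = n + m                        # dimension of the aggregated vector z = [x; u]
iu = np.triu_indices(p)

def phi(z):
    # Quadratic feature vector: upper-triangular entries of z z^T, with
    # off-diagonal terms doubled so that z^T H z = phi(z) @ h.
    zz = np.outer(z, z)
    w = np.where(iu[0] == iu[1], 1.0, 2.0)
    return w * zz[iu]

def unpack(h):
    # Rebuild the symmetric Q-function kernel H from its stacked upper triangle.
    U = np.zeros((p, p))
    U[iu] = h
    return U + U.T - np.diag(np.diag(U))

K = np.zeros((m, n))             # initial stabilizing gain (A itself is stable)
for it in range(12):
    # Policy evaluation: least squares on the Q-function Bellman equation
    # Q^K(x,u) = c(x,u) + Q^K(x', -K x'), which holds for arbitrary inputs u.
    Phi, c = [], []
    for _ in range(60):
        x = rng.standard_normal(n)
        u = -K @ x + 0.5 * rng.standard_normal(m)   # exploratory input
        xn = A @ x + B @ u
        z = np.concatenate([x, u])
        zn = np.concatenate([xn, -K @ xn])
        Phi.append(phi(z) - phi(zn))
        c.append(x @ Qw @ x + u @ Rw @ u)
    h, *_ = np.linalg.lstsq(np.array(Phi), np.array(c), rcond=None)
    H = unpack(h)
    # Policy improvement from the kernel partition: u = -H_uu^{-1} H_ux x.
    K = np.linalg.solve(H[n:, n:], H[n:, :n])

print("learned gain K:", K)

The exploration noise added to u is essential: without it the regressor matrix loses rank and the least-squares evaluation cannot identify H, which mirrors the persistence-of-excitation conditions typically required by data-driven LQ methods of this kind.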