Q-learning Based Adaptive Optimal Control for Linear Quadratic Tracking Problem

Cited by: 1

Authors:

Sharma, Shashi Kant [1]

Jha, Sumit Kumar [1]

Dhawan, Amit [1]

Tiwari, Manish [1]

Affiliation:

[1] MNNIT Allahabad, Dept Elect & Commun Engn, Prayagraj 211004, India

Source:

INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS | 2023 / Vol. 21 / No. 08

Keywords:

Adaptive optimal control; algebraic Riccati equation; linear quadratic tracking; Q-learning; continuous-time systems

DOI:

10.1007/s12555-022-0364-5

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

This paper describes a Q-learning based algorithm for designing the linear quadratic tracker (LQT) for linear time-invariant (LTI) continuous-time systems with partially unknown dynamics. The proposed approach uses a fixed-point equation in terms of the Q-function to estimate the unknown optimal gain parameters. The fixed-point equation, which is derived by applying Pontryagin's minimum principle in Q-learning, is based on the modified algebraic Riccati equation (ARE) for the LQT problem. Online adaptation of the optimal parameters is achieved with gradient descent based parameter update laws that minimize the Bellman error term derived from this fixed-point equation. A persistence of excitation condition is used to establish convergence of the estimated control parameters to their optimal values. Simulation results are presented to validate the efficiency of the proposed Q-learning approach.
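
The role of the persistence of excitation condition in a gradient descent update law can be illustrated with a generic toy problem. The sketch below is not the algorithm from the paper: it assumes a hypothetical error that is linear in the unknown parameters, e = (theta_hat - theta_true)^T phi(t), and applies an Euler-discretized gradient law to it. The regressors, the gain, the step size, and the names `gradient_estimate`, `pe_regressor`, and `non_pe_regressor` are all illustrative assumptions.

```python
import math


def gradient_estimate(regressor, theta_true, gain=0.5, dt=0.01, t_final=300.0):
    """Euler-discretized gradient law theta_hat' = -gain * e * phi.

    e is the prediction error (theta_hat - theta_true)^T phi(t); the law
    descends the gradient of e**2 / 2 with respect to theta_hat.
    """
    theta_hat = [0.0] * len(theta_true)
    steps = int(t_final / dt)
    for k in range(steps):
        phi = regressor(k * dt)
        # Measured output, generated by parameters the estimator does not know.
        y = sum(p * th for p, th in zip(phi, theta_true))
        # Prediction error whose square is being minimized.
        e = sum(p * th for p, th in zip(phi, theta_hat)) - y
        theta_hat = [th - dt * gain * e * p for th, p in zip(theta_hat, phi)]
    return theta_hat


def error_norm(theta_hat, theta_true):
    """Euclidean norm of the parameter estimation error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(theta_hat, theta_true)))


THETA_TRUE = [2.0, -1.0]  # hypothetical "optimal" parameters


def pe_regressor(t):
    # Two linearly independent signals: persistently exciting for 2 parameters.
    return [math.sin(t), math.cos(2.0 * t)]


def non_pe_regressor(t):
    # Both components identical: only the sum of the parameters is observable.
    return [math.sin(t), math.sin(t)]


err_pe = error_norm(gradient_estimate(pe_regressor, THETA_TRUE), THETA_TRUE)
err_non_pe = error_norm(gradient_estimate(non_pe_regressor, THETA_TRUE), THETA_TRUE)
print("parameter error with PE regressor:    ", err_pe)
print("parameter error with non-PE regressor:", err_non_pe)
```

With the exciting regressor the estimate approaches the true parameters, while with the degenerate regressor the prediction error can still vanish even though the parameter error does not. This is the reason an excitation condition is typically needed to claim convergence of the estimated parameters themselves.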


Pages: 2718-2725

Page count: 8

References (30 in total):

[1] [Anonymous], 2017, PROC IEEE S ADAPTIVE
[2] BOYD, S; SASTRY, SS. NECESSARY AND SUFFICIENT CONDITIONS FOR PARAMETER CONVERGENCE IN ADAPTIVE-CONTROL [J]. AUTOMATICA, 1986, 22 (06): 629-639
[3] BRADTKE SJ, 1994, PROCEEDINGS OF THE 1994 AMERICAN CONTROL CONFERENCE, VOLS 1-3, P3475
[4] Du, Yanbin; Jiang, Bin; Ma, Yajie. Policy Iteration Based Online Adaptive Optimal Fault Compensation Control for Spacecraft [J]. INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS, 2021, 19 (04): 1607-1617
[5] Edwards C., 1998, SLIDING MODE CONTROL, DOI 10.1201/9781498701822
[6] Han, Chunyan; Wang, Wei. Optimal LQ Tracking Control for Continuous-time Systems with Point-wise Time-varying Input Delay [J]. INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS, 2017, 15 (05): 2243-2252
[7] Hou DW, 2018, PROCEEDINGS OF 2018 IEEE 7TH DATA DRIVEN CONTROL AND LEARNING SYSTEMS CONFERENCE (DDCLS), P1039, DOI 10.1109/DDCLS.2018.8515964
[8] Jha S. K., 2014, P IEEE S AD DYN PROG
[9] Jha, Sumit Kumar; Roy, Sayan Basu; Bhasin, Shubhendu. Direct Adaptive Optimal Control for Uncertain Continuous-Time LTI Systems Without Persistence of Excitation [J]. IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS, 2018, 65 (12): 1993-1997
[10] Jiang, Yu; Jiang, Zhong-Ping. Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics [J]. AUTOMATICA, 2012, 48 (10): 2699-2704
