Model-Free Q-Learning for the Tracking Problem of Linear Discrete-Time Systems
Cited by: 13
Authors:
Li, Chun [1]
Ding, Jinliang [1]
Lewis, Frank L. [2]
Chai, Tianyou [1]
Affiliations:
[1] Northeastern Univ, State Key Lab Synthet Automat Proc Ind, Shenyang 110819, Peoples R China
[2] Univ Texas Arlington, Res Inst, Ft Worth, TX 76118 USA
Funding:
National Natural Science Foundation of China;
Keywords:
Desired control input;
iterative criteria;
model-free;
Q-learning;
tracking problem;
OPTIMAL OUTPUT REGULATION;
ADAPTIVE OPTIMAL-CONTROL;
DOI:
10.1109/TNNLS.2022.3195357
CLC number:
TP18 [Theory of artificial intelligence];
Discipline codes:
081104 ;
0812 ;
0835 ;
1405 ;
Abstract:
In this article, a model-free Q-learning algorithm is proposed to solve the tracking problem of linear discrete-time systems with completely unknown system dynamics. To eliminate tracking errors, a performance index of the Q-learning approach is formulated, which transforms the tracking problem into a regulation problem. Compared with existing adaptive dynamic programming (ADP) methods and Q-learning approaches, the proposed performance index adds a product term, composed of a gain matrix and the reference tracking trajectory, to the quadratic form of the control input. In addition, without requiring any prior knowledge of the dynamics of the original controlled system or the command generator, the control policy obtained by the proposed approach can be deduced by an iterative technique that relies only on online information about the system state, the control input, and the reference tracking trajectory. In each iteration of the proposed method, the desired control input can be updated by iterative criteria derived from a precondition on the controlled system and the reference tracking trajectory, which ensures that the obtained control policy can, in theory, eliminate tracking errors. Moreover, to obtain the optimal control policy from less data, an off-policy approach is introduced into the proposed algorithm. Finally, the effectiveness of the proposed algorithm is verified by a numerical simulation.
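A minimal LaTeX sketch of the kind of performance index the abstract describes, under assumed notation: x_k is the system state, u_k the control input, r_k the reference tracking trajectory, Q and R weighting matrices, gamma a discount factor, and M the gain matrix multiplying the reference; all symbols here are illustrative assumptions, not the paper's exact formulation.

\[
J \;=\; \sum_{k=0}^{\infty} \gamma^{k}\Big[(x_k - r_k)^{\top} Q\,(x_k - r_k) + (u_k + M r_k)^{\top} R\,(u_k + M r_k)\Big]
\]

The term M r_k is a product of a gain matrix and the reference trajectory added to the control-input quadratic form, as the abstract states; because the penalized input u_k + M r_k can vanish at the desired steady state, minimizing J amounts to a regulation problem rather than a tracking one.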
Pages: 3191-3201
Number of pages: 11