Control Law Learning Based on LQR Reconstruction With Inverse Optimal Control

Cited by: 0
Authors
Qu, Chendi [1 ]
He, Jianping [1 ]
Duan, Xiaoming [1 ]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Automat, Shanghai 200240, Peoples R China
Funding
Natural Science Foundation of Shanghai; National Natural Science Foundation of China
Keywords
Trajectory; Optimization; Linear programming; Estimation; Object recognition; Mobile agents; Mathematical models; Sensitivity analysis; Search problems; Robots; Inverse optimal control; linear systems; LQR control; HORIZON;
DOI
10.1109/TAC.2024.3469788
CLC number
TP [Automation Technology, Computer Technology]
Discipline code
0812
Abstract
Designing controllers to generate various trajectories has been studied for years, while recovering an optimal controller from trajectories has recently received increasing attention. In this article, we reveal that the inherent linear quadratic regulator (LQR) problem of a moving agent can be reconstructed from its trajectory observations alone, which enables one to learn the control law of the target agent autonomously. Specifically, we propose a novel inverse optimal control method to identify the weighting matrices of a discrete-time finite-horizon LQR, and we provide the corresponding identifiability conditions. We then obtain the optimal estimate of the control horizon using binary search and finally reconstruct the LQR problem from the aforementioned estimates. The strength of learning the control law via optimization problem recovery lies in its lower computational cost and strong generalization ability. We apply our algorithm to the prediction of future control inputs and further derive the discrepancy loss. Simulations and hardware experiments on a self-designed robot platform illustrate the effectiveness of our work.
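For context, the forward problem that the paper inverts is the discrete-time finite-horizon LQR. The sketch below is a minimal illustration of that forward problem only, not the paper's inverse method; the function and variable names, the numpy implementation, and the double-integrator example values are all my own assumptions. It computes the time-varying optimal gains by backward Riccati recursion, so that the optimal input is u_t = -K_t x_t; an inverse method in the spirit of the abstract would go the other way, inferring the weighting matrices Q and R and the horizon N (the latter, per the abstract, via binary search) from observed trajectories.

```python
import numpy as np

def finite_horizon_lqr(A, B, Q, R, Qf, N):
    """Time-varying gains K[0..N-1] of the discrete-time finite-horizon LQR
    minimizing sum_{t=0}^{N-1} (x_t' Q x_t + u_t' R u_t) + x_N' Qf x_N,
    computed by backward Riccati recursion; u_t = -K[t] @ x_t is optimal."""
    P = Qf
    gains = []
    for _ in range(N):
        # Stage gain: K_t = (R + B' P B)^{-1} B' P A
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        # Riccati step: P_t = Q + A' P (A - B K_t)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]  # reorder so gains[t] applies at time t

# Hypothetical example: double integrator with horizon N = 20.
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.5 * dt**2], [dt]])
K = finite_horizon_lqr(A, B, Q=np.eye(2), R=np.eye(1), Qf=np.eye(2), N=20)

x = np.array([1.0, 0.0])
for t in range(20):
    x = A @ x + B @ (-K[t] @ x)  # roll out the closed loop
```

Because the gains are time-varying, trajectories generated this way carry information about both the weights and the horizon, which is what makes the reconstruction described in the abstract possible.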
Pages: 1350-1357
Number of pages: 8