Model Predictive Control-Based Value Estimation for Efficient Reinforcement Learning

Cited by: 0

Authors:

Wu, Qizhen [1]

Liu, Kexin [1]

Chen, Lei [2]

Affiliations:

[1] Beihang Univ, Sch Automat Sci & Elect Engn, Beijing 100191, Peoples R China

[2] Beijing Inst Technol, Adv Res Inst Multidisciplinary Sci, Beijing 100081, Peoples R China

Source:

IEEE INTELLIGENT SYSTEMS | 2024 / Vol. 39 / Issue 03

Funding:

National Key Research and Development Program of China; National Science Foundation (USA);

Keywords:

Predictive models; Neural networks; Trajectory; Data models; Computational modeling; Training; Optimization; Reinforcement learning; Predictive control;

DOI:

10.1109/MIS.2024.3386204

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Reinforcement learning (RL) is limited in practice primarily by the number of interactions it requires with virtual environments. This is a challenging problem because, for many learning methods, a locally optimal strategy is unlikely to be obtained within only a few attempts. We therefore design an improved RL method based on model predictive control that models the environment through a data-driven approach. Based on the learned environment model, the method performs multistep prediction to estimate the value function and optimize the policy. It demonstrates higher learning efficiency, faster convergence of strategies toward the local optimum, and a smaller sample capacity required by the experience replay buffer. Experimental results, both in classic databases and in a dynamic obstacle-avoidance scenario for an unmanned aerial vehicle, validate the proposed approach.
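
To make the idea of multistep prediction for value estimation concrete, the following is a minimal, generic sketch and not the authors' implementation. It assumes a learned one-step model that returns a next state and a reward, a fixed policy, and a bootstrap value function; the names `multistep_value_estimate`, `toy_model`, `toy_policy`, and `toy_value` are hypothetical stand-ins introduced only for illustration.

```python
from typing import Callable, Tuple


def multistep_value_estimate(
    state: float,
    policy: Callable[[float], float],
    model: Callable[[float, float], Tuple[float, float]],
    terminal_value: Callable[[float], float],
    horizon: int,
    gamma: float = 0.99,
) -> float:
    """Roll a (learned) model forward `horizon` steps under `policy`.

    Returns the discounted sum of predicted rewards plus the discounted
    bootstrap value of the final predicted state.
    """
    total, discount = 0.0, 1.0
    for _ in range(horizon):
        action = policy(state)
        state, reward = model(state, action)  # predicted transition
        total += discount * reward
        discount *= gamma
    return total + discount * terminal_value(state)


# Hypothetical stand-ins for a learned model, policy, and value function.
def toy_model(state: float, action: float) -> Tuple[float, float]:
    next_state = 0.5 * state + action
    reward = -(state ** 2)
    return next_state, reward


def toy_policy(state: float) -> float:
    return 0.0


def toy_value(state: float) -> float:
    return 0.0


estimate = multistep_value_estimate(
    1.0, toy_policy, toy_model, toy_value, horizon=3, gamma=0.5
)
print(estimate)
```

In a sketch like this, a longer horizon relies more on the model's predictions and less on the bootstrap value, so model error compounds with the number of predicted steps; how the paper handles that trade-off is not stated in the abstract.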


Pages: 63-72

Page count: 10
