Model Predictive Control-Based Value Estimation for Efficient Reinforcement Learning

Cited by: 0
Authors
Wu, Qizhen [1]
Liu, Kexin [1]
Chen, Lei [2]
Affiliations
[1] Beihang Univ, Sch Automat Sci & Elect Engn, Beijing 100191, Peoples R China
[2] Beijing Inst Technol, Adv Res Inst Multidisciplinary Sci, Beijing 100081, Peoples R China
Funding
National Key Research and Development Program; U.S. National Science Foundation
Keywords
Predictive models; Neural networks; Trajectory; Data models; Computational modeling; Training; Optimization; Reinforcement learning; Predictive control;
DOI
10.1109/MIS.2024.3386204
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Reinforcement learning (RL) is limited in practice primarily by the number of interactions it requires with virtual environments. This is a challenging problem because, for many learning methods, it is implausible to obtain even a locally optimal strategy from only a few attempts. We therefore design an improved RL method based on model predictive control that models the environment through a data-driven approach. Using the learned environment model, the method performs multistep prediction to estimate the value function and optimize the policy. It demonstrates higher learning efficiency, faster convergence of strategies toward the locally optimal value, and a smaller sample capacity required by experience replay buffers. Experimental results, both on classic databases and in a dynamic obstacle-avoidance scenario for an unmanned aerial vehicle, validate the proposed approach.
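The multistep value estimate mentioned in the abstract can be illustrated with a generic sketch. This is not the authors' algorithm, which the abstract does not specify; it assumes a standard H-step model-based rollout in which rewards predicted by a learned model are discounted and summed, then bootstrapped with a terminal value estimate. All names here (`multistep_value_estimate`, `policy`, `model`, `terminal_value`) are hypothetical.

```python
from typing import Callable, Tuple, TypeVar

S = TypeVar("S")  # state type (hypothetical, for illustration)
A = TypeVar("A")  # action type (hypothetical, for illustration)


def multistep_value_estimate(
    state: S,
    policy: Callable[[S], A],
    model: Callable[[S, A], Tuple[S, float]],
    terminal_value: Callable[[S], float],
    horizon: int,
    gamma: float = 0.99,
) -> float:
    """Generic H-step value estimate under a learned model (assumed form).

    Returns sum_{k<H} gamma^k * r_k + gamma^H * V(s_H), where each
    (s_{k+1}, r_k) is predicted by `model`, not by the real environment.
    """
    total = 0.0
    discount = 1.0
    for _ in range(horizon):
        action = policy(state)
        state, reward = model(state, action)  # one predicted transition
        total += discount * reward
        discount *= gamma
    # Bootstrap from the value estimate at the end of the rollout.
    return total + discount * terminal_value(state)


# Toy stand-ins for a learned model and value function (assumptions).
def toy_policy(s: float) -> float:
    return 1.0


def toy_model(s: float, a: float) -> Tuple[float, float]:
    return s + a, 1.0  # next state, constant reward of 1


estimate = multistep_value_estimate(
    0.0, toy_policy, toy_model, lambda s: 10.0, horizon=3, gamma=0.5
)
```

With these toy stand-ins the rollout contributes 1 + 0.5 + 0.25 and the bootstrap term contributes 0.125 * 10, so `estimate` is 3.0. Because the rollout queries the model instead of the environment, estimates of this kind are one common way to reduce the number of real interactions.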
Pages: 63-72
Number of pages: 10