Model Predictive Control-Based Value Estimation for Efficient Reinforcement Learning

Times cited: 0
Authors
Wu, Qizhen [1]
Liu, Kexin [1]
Chen, Lei [2]
Affiliations
[1] Beihang Univ, Sch Automat Sci & Elect Engn, Beijing 100191, Peoples R China
[2] Beijing Inst Technol, Adv Res Inst Multidisciplinary Sci, Beijing 100081, Peoples R China
Funding
National Key Research and Development Program of China; National Science Foundation (United States);
Keywords
Predictive models; Neural networks; Trajectory; Data models; Computational modeling; Training; Optimization; Reinforcement learning; Predictive control;
DOI
10.1109/MIS.2024.3386204
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Reinforcement learning (RL) is limited in practice primarily by the number of interactions it requires with virtual environments. This is a challenging problem because, for many learning methods, a locally optimal strategy cannot plausibly be obtained in only a few attempts. We therefore design an improved RL method based on model predictive control that models the environment through a data-driven approach. Using the learned environment model, the method performs multistep prediction to estimate the value function and optimize the policy. It demonstrates higher learning efficiency, faster convergence of strategies toward the locally optimal value, and a smaller sample capacity required of the experience replay buffer. Experimental results, both in classic databases and in a dynamic obstacle-avoidance scenario for an unmanned aerial vehicle, validate the proposed approach.
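The multistep value estimation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a generic H-step rollout estimate, V(s) ≈ Σ_{k<H} γ^k r_k + γ^H V(s_H), and every name in it (`multistep_value_estimate`, `toy_model`, `toy_policy`, `toy_terminal_value`) is hypothetical. The toy model is a hand-written stand-in for a model that would, in the paper's setting, be learned from data.

```python
from typing import Callable, Tuple

State = float
Action = float


def multistep_value_estimate(
    state: State,
    policy: Callable[[State], Action],
    model: Callable[[State, Action], Tuple[State, float]],
    terminal_value: Callable[[State], float],
    horizon: int = 5,
    gamma: float = 0.99,
) -> float:
    """Roll the model forward `horizon` steps and return the discounted return
    plus a discounted bootstrap value at the final predicted state."""
    total = 0.0
    discount = 1.0
    for _ in range(horizon):
        action = policy(state)
        state, reward = model(state, action)  # predicted next state and reward
        total += discount * reward
        discount *= gamma
    return total + discount * terminal_value(state)


# Hypothetical stand-ins, chosen only so the arithmetic is easy to check.
def toy_model(state: State, action: Action) -> Tuple[State, float]:
    return state + action, 1.0  # constant reward of 1 per step


def toy_policy(state: State) -> Action:
    return 0.0  # always take the zero action


def toy_terminal_value(state: State) -> float:
    return 8.0  # assumed value of the final predicted state


estimate = multistep_value_estimate(
    0.0, toy_policy, toy_model, toy_terminal_value, horizon=3, gamma=0.5
)
print(estimate)  # 1 + 0.5 + 0.25 + 0.125 * 8 = 2.75
```

A longer horizon relies more on the model's predictions and less on the bootstrap value, so the estimate is only as good as the model over that horizon.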
Pages: 63-72
Page count: 10