Model-Based Reinforcement Learning With Probabilistic Ensemble Terminal Critics for Data-Efficient Control Applications

Cited by: 4
Authors
Park, Jonghyeok [1]
Jeon, Soo [2]
Han, Soohee [1]
Affiliations
[1] Pohang Univ Sci & Technol, Dept Elect Engn & Convergence IT Engn, Pohang 37673, South Korea
[2] Univ Waterloo, Dept Mech & Mechatron Engn, Waterloo, ON N2L 3G1, Canada
Funding
National Research Foundation, Singapore;
Keywords
Heuristic algorithms; Data models; Robots; Probabilistic logic; Computational modeling; Reliability; Reinforcement learning; Cartpole system; model-predictive controller (MPC); model-based reinforcement learning (RL); probabilistic ensemble terminal critics (PETC);
DOI
10.1109/TIE.2023.3331074
Chinese Library Classification number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
This article proposes a data-efficient model-based reinforcement learning (RL) algorithm empowered by reliable future reward estimates obtained through confidence-based probabilistic ensemble terminal critics (PETC). The proposed algorithm uses a model-predictive controller to choose the action that optimizes the sum of the near and distant future rewards for a given current state. Near future rewards, which carry high confidence, are determined directly from trained deterministic dynamics and reward models. Distant future rewards beyond this horizon are assessed using the proposed confidence-based PETC, which minimize the estimation errors inherent in the distant future and quantify the uncertainty of each estimate. Through such confidence-guided actions, the proposed approach is expected to operate in a reliable, explainable, and data-efficient manner, consistently guiding the system to an optimal trajectory. A comparison with existing state-of-the-art RL algorithms on eight DeepMind Control Suite tasks confirms the superior data efficiency of the proposed approach, which achieves an average cumulative reward of 761.2 in merely 500K steps, whereas the other algorithms score below 700.0. The proposed algorithm is also successfully applied to two real-world control applications, namely single- and double-cartpole swing-up tasks.
Pages: 9470-9479
Page count: 10
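
The planning scheme described in the abstract (near-future rewards rolled out through learned models, plus a confidence-aware terminal estimate from an ensemble of critics) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy `dynamics` and `reward` functions, the `CRITICS` ensemble, the `kappa` penalty, the "mean minus spread" confidence rule, and the random-shooting planner are all assumptions chosen for illustration.

```python
import random
import statistics


def dynamics(state, action):
    # Hypothetical stand-in for a trained deterministic dynamics model.
    return state + 0.1 * action


def reward(state, action):
    # Hypothetical stand-in for a trained reward model: stay near the origin.
    return -(state ** 2) - 0.01 * action ** 2


# Hypothetical ensemble of terminal critics. Each member returns a
# (mean, std) pair for the return expected beyond the planning horizon.
CRITICS = [
    (lambda s, c=c: (-c * s ** 2, 0.1 + 0.05 * abs(s)))
    for c in (4.0, 5.0, 6.0)
]


def terminal_value(state, kappa=1.0):
    """Ensemble estimate penalized by uncertainty (assumed rule, not the paper's)."""
    means = [critic(state)[0] for critic in CRITICS]
    stds = [critic(state)[1] for critic in CRITICS]
    mean = statistics.fmean(means)
    # Spread = disagreement between members + their average predicted std.
    spread = statistics.pstdev(means) + statistics.fmean(stds)
    return mean - kappa * spread


def score(state, actions, gamma=0.99):
    """Discounted near-future rewards plus the discounted terminal estimate."""
    total, discount = 0.0, 1.0
    for action in actions:
        total += discount * reward(state, action)
        state = dynamics(state, action)
        discount *= gamma
    return total + discount * terminal_value(state)


def plan(state, horizon=5, candidates=256, seed=0):
    """Random-shooting MPC: return the first action of the best sampled sequence."""
    rng = random.Random(seed)
    sequences = [
        [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        for _ in range(candidates)
    ]
    best = max(sequences, key=lambda seq: score(state, seq))
    return best[0]


if __name__ == "__main__":
    print(plan(1.0))
```

In this sketch, only the first action of the best sequence is applied and planning is repeated at the next state, as is usual for model-predictive control. A larger `kappa` makes the planner more conservative wherever the critics disagree.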