Reinforcement Learning based on MPC/MHE for Unmodeled and Partially Observable Dynamics

Cited by: 14

Authors:

Esfahani, Hossein Nejatbakhsh [1]

Kordabad, Arash Bahari [1]

Gros, Sebastien [1]

Affiliations:

[1] Norwegian Univ Sci & Technol NTNU, Dept Engn Cybernet, Trondheim, Norway

Source:

2021 AMERICAN CONTROL CONFERENCE (ACC) | 2021

Keywords:

DOI:

10.23919/ACC50511.2021.9483399

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

This paper proposes an observer-based framework for solving Partially Observable Markov Decision Processes (POMDPs) when an accurate model is not available. We first propose to use a Moving Horizon Estimation-Model Predictive Control (MHE-MPC) scheme to provide a policy for the POMDP problem, where the full state of the real process is neither measured nor necessarily known. We propose to parameterize both the MPC and MHE formulations, where certain adjustable parameters are used for tuning the policy. To tackle the unmodeled and partially observable dynamics, we leverage Reinforcement Learning (RL) to tune the parameters of the MPC and MHE schemes jointly, with the closed-loop performance of the policy as the goal rather than model fitting or the MHE performance. Illustrations show that the proposed approach can effectively increase the closed-loop control performance of systems formulated as POMDPs.

Pages: 2121-2126

Page count: 6

References: 21 in total

