Cooperative Model-Based Reinforcement Learning for Approximate Optimal Tracking

Cited by: 0

Authors:

Greene, Max L. [1]

Bell, Zachary I. [2]

Nivison, Scott A. [2]

How, Jonathan P. [3]

Dixon, Warren E. [1]

Affiliations:

[1] Univ Florida, Dept Mech & Aerosp Engn, Gainesville, FL 32611 USA

[2] Air Force Res Lab, Munit Directorate, Eglin AFB, FL USA

[3] MIT, Dept Aeronaut & Astronaut, Cambridge, MA 02139 USA

Source:

2021 American Control Conference (ACC) | 2021

Keywords:

SYSTEMS;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

This paper provides an approximate online adaptive solution to the infinite-horizon optimal tracking problem for a set of agents with homogeneous dynamics and common tracking objectives. Model-based reinforcement learning is implemented by simultaneously evaluating the Bellman error (BE) at the state of each agent and, as needed, at nearby off-trajectory points throughout the state space. Each agent calculates its on- and off-trajectory BE information and shares it with a centralized estimator, which computes updates for the approximate solution to the infinite-horizon optimal tracking problem and shares the estimate with the agents. In doing so, edge computing is leveraged to share the computational burden associated with BE extrapolation between the agents and a centralized updating resource. Uniformly ultimately bounded tracking of each agent's state to the desired state and convergence of the control policy to a neighborhood of the optimal policy are proven via a Lyapunov-like stability analysis.

Pages: 1973-1978

Page count: 6
