Cooperative Model-Based Reinforcement Learning for Approximate Optimal Tracking

Times Cited: 0
Authors
Greene, Max L. [1 ]
Bell, Zachary, I [2 ]
Nivison, Scott A. [2 ]
How, Jonathan P. [3 ]
Dixon, Warren E. [1 ]
Affiliations
[1] Univ Florida, Dept Mech & Aerosp Engn, Gainesville, FL 32611 USA
[2] Air Force Res Lab, Munit Directorate, Eglin AFB, FL USA
[3] MIT, Dept Aeronaut & Astronaut, Cambridge, MA 02139 USA
Source
2021 AMERICAN CONTROL CONFERENCE (ACC) | 2021
Keywords
SYSTEMS;
DOI
Not available
Chinese Library Classification
TP [Automation Technology, Computer Technology];
Discipline Classification Code
0812 ;
Abstract
This paper provides an approximate online adaptive solution to the infinite-horizon optimal tracking problem for a set of agents with homogeneous dynamics and common tracking objectives. Model-based reinforcement learning is implemented by simultaneously evaluating the Bellman error (BE) at the state of each agent and at nearby off-trajectory points, as needed, throughout the state space. Each agent calculates and shares its respective on- and off-trajectory BE information with a centralized estimator, which computes updates for the approximate solution to the infinite-horizon optimal tracking problem and shares the estimate with the agents. Edge computing is thereby leveraged to share the computational burden associated with BE extrapolation between the agents and a centralized updating resource. Uniformly ultimately bounded tracking of each agent's state to the desired state and convergence of the control policy to a neighborhood of the optimal policy are proven via a Lyapunov-like stability analysis.
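The core mechanism described in the abstract, evaluating the Bellman error both at each agent's current state and at extrapolated off-trajectory points, then letting a centralized estimator fuse all samples into one weight update, can be illustrated with a minimal sketch. This is an illustrative toy (scalar linear dynamics, quadratic cost, single polynomial basis function, a plain normalized-gradient update), not the authors' exact update laws; all function and gain names are assumptions.

```python
import numpy as np

# Known model: scalar dynamics x_dot = a*x + b*u, running cost r = x^2 + u^2.
a, b = -1.0, 1.0

def phi(x):
    """Basis for the value-function approximation V(x) ~ W . phi(x)."""
    return np.array([x**2])

def grad_phi(x):
    return np.array([2.0 * x])

def policy(x, W):
    """Approximate optimal policy u = -0.5 * R^{-1} * b * dV/dx (here R = 1)."""
    return -0.5 * b * grad_phi(x) @ W

def bellman_error(x, W):
    """HJB residual at state x under the current weight estimate W."""
    u = policy(x, W)
    x_dot = a * x + b * u
    return grad_phi(x) @ W * x_dot + x**2 + u**2

def centralized_update(W, sample_points, lr=0.05):
    """Centralized estimator: aggregate BE samples reported by all agents
    into a single normalized-gradient step on the shared weights."""
    grad = np.zeros_like(W)
    for x in sample_points:
        delta = bellman_error(x, W)
        omega = grad_phi(x) * (a * x + b * policy(x, W))   # regressor
        grad += delta * omega / (1.0 + omega @ omega)      # normalized step
    return W - lr * grad / len(sample_points)

W = np.array([0.0])
agent_states = [1.0, -0.5, 2.0]
for _ in range(2000):
    # Each agent contributes its on-trajectory state plus two nearby
    # off-trajectory (extrapolated) points.
    samples = [x + dx for x in agent_states for dx in (-0.2, 0.0, 0.2)]
    W = centralized_update(W, samples)
```

For this toy problem the algebraic Riccati equation gives the optimal value function V*(x) = (sqrt(2) - 1) x^2, so W should converge near sqrt(2) - 1 ≈ 0.414, mirroring the paper's convergence of the policy to a neighborhood of the optimal policy. In the paper's scheme the per-sample BE evaluations run on the agents (edge devices) while only the aggregation step runs centrally.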
Pages: 1973-1978
Page count: 6
Related Papers
50 records
  • [41] Toward on-sky adaptive optics control using reinforcement learning: Model-based policy optimization for adaptive optics
    Nousiainen, J.
    Rajani, C.
    Kasper, M.
    Helin, T.
    Haffert, S. Y.
    Verinaud, C.
    Males, J. R.
    Van Gorkom, K.
    Close, L. M.
    Long, J. D.
    Hedglen, A. D.
    Guyon, O.
    Schatz, L.
    Kautz, M.
    Lumbres, J.
    Rodack, A.
    Knight, J. M.
    Miller, K.
    ASTRONOMY & ASTROPHYSICS, 2022, 664
  • [42] Tree-based reinforcement learning for optimal water reservoir operation
    Castelletti, A.
    Galelli, S.
    Restelli, M.
    Soncini-Sessa, R.
    WATER RESOURCES RESEARCH, 2010, 46
  • [43] Macroscopic model-based swarm guidance for a class of contaminant tracking applications
    Ghanavati, Meysam
    Chakravarthy, Animesh
    Menon, Prathyush P.
    INTERNATIONAL JOURNAL OF CONTROL, 2022, 95 (04) : 975 - 984
  • [44] Mouse tracking reveals structure knowledge in the absence of model-based choice
    Konovalov, Arkady
    Krajbich, Ian
    NATURE COMMUNICATIONS, 2020, 11 (01)
  • [45] Transcranial Direct Current Stimulation of Right Dorsolateral Prefrontal Cortex Does Not Affect Model-Based or Model-Free Reinforcement Learning in Humans
    Smittenaar, Peter
    Prichard, George
    FitzGerald, Thomas H. B.
    Diedrichsen, Joern
    Dolan, Raymond J.
    PLOS ONE, 2014, 9 (01)
  • [46] Cognitive components underpinning the development of model-based learning
    Potter, Tracey C. S.
    Bryce, Nessa V.
    Hartley, Catherine A.
    DEVELOPMENTAL COGNITIVE NEUROSCIENCE, 2017, 25 : 272 - 280
  • [47] Visual Active Tracking Algorithm for UAV Cluster Based on Deep Reinforcement Learning
    Hu, Runqiao
    Wang, Shaofan
    Li, Ke
    PROCEEDINGS OF 2022 INTERNATIONAL CONFERENCE ON AUTONOMOUS UNMANNED SYSTEMS, ICAUS 2022, 2023, 1010 : 1047 - 1061
  • [48] Adaptive Optimal Tracking Control of an Underactuated Surface Vessel Using Actor-Critic Reinforcement Learning
    Chen, Lin
    Dai, Shi-Lu
    Dong, Chao
    IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (06) : 7520 - 7533
  • [49] Event-Triggered Optimal Tracking Control for Underactuated Surface Vessels via Neural Reinforcement Learning
    Liu, Xiang
    Yan, Huaicheng
    Zhou, Weixiang
    Wang, Ning
    Wang, Yueying
    IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, 2024, 20 (11) : 12837 - 12847
  • [50] Safe resource management of non-cooperative microgrids based on deep reinforcement learning
    Shademan, Mahdi
    Karimi, Hamid
    Jadid, Shahram
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2023, 126