Optimal tracking agent: a new framework of reinforcement learning for multiagent systems

Cited by: 3

Authors:

Cao, Weihua [1]

Chen, Gang [1]

Chen, Xin [1]

Wu, Min [1]

Affiliations:

[1] Cent South Univ, Inst Adv Control & Intelligent Automat, Sch Informat Sci & Engn, Changsha 410083, Hunan, Peoples R China

Source:

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE | 2013, Vol. 25, No. 14

Funding:

Specialized Research Fund for the Doctoral Program of Higher Education;

Keywords:

estimator; action selection mechanism; curse of dimensionality; optimal tracking agent; multiagent systems;

DOI:

10.1002/cpe.2870

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

The curse of dimensionality is a ubiquitous problem in multiagent reinforcement learning: the learning and storage space grows exponentially with the number of agents, which hinders the application of multiagent reinforcement learning. To relieve this problem, we propose a new framework named the optimal tracking agent (OTA). The OTA views the other agents as part of the environment and uses a reduced form to learn the optimal decision. Although merging other agents into the environment may reduce the dimension of the action space, the environment characterized in this form is dynamic and does not satisfy the convergence conditions of reinforcement learning (RL). Thus, we develop an estimator to track the dynamics of the environment. The estimator obtains the dynamic model, and model-based RL can then be used to react to the dynamic environment optimally. In traditional RL, learning is a stationary process and the usual action selection mechanisms suit only such stationary processes; in the OTA, by contrast, the Q-function is itself a dynamic process because of the other agents' dynamics, so we improve the greedy action selection mechanism to adapt to these dynamics. As a result, the OTA converges. An experiment illustrates the validity and efficiency of the OTA. Copyright (c) 2012 John Wiley & Sons, Ltd.
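
The exponential growth that the abstract describes can be illustrated with a small calculation. The sketch below is not taken from the paper. It assumes a tabular Q-function, a shared state set, and the same number of actions for every agent; the function names `joint_q_table_size` and `reduced_q_table_size` are hypothetical and chosen only for this illustration.

```python
def joint_q_table_size(num_states: int, num_actions: int, num_agents: int) -> int:
    """Entries in a Q-table indexed by the state and the joint action of all agents.

    Assumption (not from the paper): every agent has `num_actions` actions,
    so there are num_actions ** num_agents joint actions per state.
    """
    return num_states * num_actions ** num_agents


def reduced_q_table_size(num_states: int, num_actions: int) -> int:
    """Entries in a Q-table indexed by the state and one agent's own action.

    This corresponds to treating the other agents as part of the environment,
    so the table no longer depends on the number of agents.
    """
    return num_states * num_actions


if __name__ == "__main__":
    # Hypothetical problem size: 100 states, 5 actions per agent.
    for n in (1, 2, 4, 8):
        joint = joint_q_table_size(100, 5, n)
        reduced = reduced_q_table_size(100, 5)
        print(f"agents={n}: joint={joint}, reduced={reduced}")
```

Under these assumptions the joint table grows by a factor of `num_actions` with each added agent, while the reduced table stays the same size. The price of the reduction, as the abstract notes, is that the environment seen by the single agent becomes nonstationary.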

Pages: 2002-2015

Number of pages: 14

50 items in total

[21] Beyond Reinforcement Learning and Local View in Multiagent Systems
Bazzan, Ana L. C.
KUNSTLICHE INTELLIGENZ, 2014, 28 (03) : 179 - 189
[22] Optimal Tracking Control of Nonlinear Multiagent Systems Using Internal Reinforce Q-Learning
Peng, Zhinan
Luo, Rui
Hu, Jiangping
Shi, Kaibo
Nguang, Sing Kiong
Ghosh, Bijoy Kumar
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2022, 33 (08) : 4043 - 4055
[23] Prior Knowledge-Augmented Broad Reinforcement Learning Framework for Fault Diagnosis of Heterogeneous Multiagent Systems
Guo, Li
Ren, Yiran
Li, Runze
Jiang, Bin
IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS, 2024, 16 (01) : 115 - 123
[24] Integral Reinforcement-Learning-Based Optimal Containment Control for Partially Unknown Nonlinear Multiagent Systems
Wu, Qiuye
Wu, Yongheng
Wang, Yonghua
ENTROPY, 2023, 25 (02)
[25] Optimal Synchronization Control of Multiagent Systems With Input Saturation via Off-Policy Reinforcement Learning
Qin, Jiahu
Li, Man
Shi, Yang
Ma, Qichao
Zheng, Wei Xing
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2019, 30 (01) : 85 - 96
[26] Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain
Hao, Jianye
Yang, Tianpei
Tang, Hongyao
Bai, Chenjia
Liu, Jinyi
Meng, Zhaopeng
Liu, Peng
Wang, Zhen
IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS, 2024, 35 (07) : 8762 - 8782
[27] Reinforcement learning intermittent optimal formation control for multi-agent systems with disturbances
Liu, Erliang
Miao, Guoying
Hu, Jingyu
MEASUREMENT SCIENCE AND TECHNOLOGY, 2024, 35 (12)
[28] Adaptive Fault-Tolerant Tracking Control for Discrete-Time Multiagent Systems via Reinforcement Learning Algorithm
Li, Hongyi
Wu, Ying
Chen, Mou
IEEE TRANSACTIONS ON CYBERNETICS, 2021, 51 (03) : 1163 - 1174
[29] Optimal iterative learning control design for multi-agent systems consensus tracking
Yang, Shiping
Xu, Jian-Xin
Huang, Deqing
Tan, Ying
SYSTEMS & CONTROL LETTERS, 2014, 69 : 80 - 89
[30] Inverse Reinforcement Learning for Decentralized Non-Cooperative Multiagent Systems
Reddy, Tummalapalli Sudhamsh
Gopikrishna, Vamsikrishna
Zaruba, Gergely
Huber, Manfred
PROCEEDINGS 2012 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2012 : 1930 - 1935
