Optimal tracking agent: a new framework of reinforcement learning for multiagent systems

Cited by: 3

Authors:

Cao, Weihua [1]

Chen, Gang [1]

Chen, Xin [1]

Wu, Min [1]

Affiliations:

[1] Cent South Univ, Inst Adv Control & Intelligent Automat, Sch Informat Sci & Engn, Changsha 410083, Hunan, Peoples R China

Source:

CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE | 2013 / Vol. 25 / No. 14

Funding:

Specialized Research Fund for the Doctoral Program of Higher Education;

Keywords:

estimator; action selection mechanism; curse of dimensionality; optimal tracking agent; multiagent systems;

DOI:

10.1002/cpe.2870

Chinese Library Classification (CLC) number:

TP31 [Computer software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

The curse of dimensionality is a ubiquitous problem in multiagent reinforcement learning: the learning and storage space grows exponentially with the number of agents, which hinders the application of multiagent reinforcement learning. To relieve this problem, we propose a new framework named the optimal tracking agent (OTA). The OTA views the other agents as part of the environment and uses a reduced form to learn the optimal decision. Although merging the other agents into the environment may reduce the dimension of the action space, the environment characterized by this form is dynamic and does not satisfy the convergence conditions of reinforcement learning (RL). Thus, we develop an estimator to track the dynamics of the environment. The estimator obtains the dynamic model, and model-based RL can then be used to react optimally to the dynamic environment. In traditional RL, learning is a stationary process, and the usual action selection mechanisms suit only such a stationary process. In the OTA, by contrast, the Q-function is itself a dynamic process because of the other agents' dynamics, so we improve the greedy action selection mechanism to adapt to these dynamics. As a result, the OTA converges. An experiment illustrates the validity and efficiency of the OTA. Copyright (c) 2012 John Wiley & Sons, Ltd.
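
The exponential growth mentioned in the abstract can be made concrete by counting table entries. The sketch below is only an illustration, not the authors' method: it assumes a tabular Q-function, the same number of actions for every agent, and a reduced form indexed by the state and one agent's own action. The abstract does not specify the reduced form, the estimator, or the action selection rule, so none of those are modeled here. The function names and the example sizes (100 states, 5 actions) are hypothetical.

```python
def joint_q_table_size(num_states: int, num_actions: int, num_agents: int) -> int:
    """Entries in a Q-table indexed by state and the joint action of all agents.

    Assumes every agent has the same number of actions (an illustrative
    simplification, not something stated in the abstract).
    """
    return num_states * num_actions ** num_agents


def reduced_q_table_size(num_states: int, num_actions: int) -> int:
    """Entries in a Q-table indexed by state and a single agent's own action.

    This is one possible reading of a "reduced form" in which the other
    agents are treated as part of the environment.
    """
    return num_states * num_actions


if __name__ == "__main__":
    # Hypothetical problem size: 100 states, 5 actions per agent.
    for n in (1, 2, 4, 8):
        joint = joint_q_table_size(100, 5, n)
        reduced = reduced_q_table_size(100, 5)
        print(f"agents={n}  joint entries={joint}  reduced entries={reduced}")
```

Under these assumptions the joint table grows by a factor of the action count with each added agent, while the reduced table stays the same size. The price of the reduction, as the abstract notes, is that the environment seen by the agent is no longer stationary.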


Pages: 2002 - 2015

Number of pages: 14

50 items in total

[1] Optimal Tracking Agent: A New Framework for Multi-Agent Reinforcement Learning
Cao, Weihua
Chen, Gang
Chen, Xin
Wu, Min
TRUSTCOM 2011: 2011 INTERNATIONAL JOINT CONFERENCE OF IEEE TRUSTCOM-11/IEEE ICESS-11/FCST-11, 2011 : 1328 - 1334
[2] Formation Tracking of Spatiotemporal Multiagent Systems: A Decentralized Reinforcement Learning Approach
Liu, Tianrun
Chen, Yang-Yang
IEEE SYSTEMS MAN AND CYBERNETICS MAGAZINE, 2024, 10 (04) : 52 - 60
[3] iEnsemble: A Framework for Committee Machine Based on Multiagent Systems with Reinforcement Learning
Uber Junior, Arnoldo
de Freitas Filho, Paulo Jose
Silveira, Ricardo Azambuja
Costa e Lima, Mariana Dehon
Reitz, Rodolfo Wilvert
ADVANCES IN SOFT COMPUTING, MICAI 2016, PT II, 2017, 10062 : 65 - 80
[4] Temporal and Agent Abstractions in Multiagent Reinforcement Learning
Clement, Danielle M.
Huber, Manfred
2016 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN, AND CYBERNETICS (SMC), 2016 : 2190 - 2195
[5] Distributed Optimal Tracking Control of Discrete-Time Multiagent Systems via Event-Triggered Reinforcement Learning
Peng, Zhinan
Luo, Rui
Hu, Jiangping
Shi, Kaibo
Ghosh, Bijoy Kumar
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I-REGULAR PAPERS, 2022, 69 (09) : 3689 - 3700
[6] Optimal Group Consensus of Multiagent Systems in Graphical Games Using Reinforcement Learning
Wang, Yuhan
Wang, Zhuping
Zhang, Hao
Yan, Huaicheng
IEEE TRANSACTIONS ON SYSTEMS MAN CYBERNETICS-SYSTEMS, 2025, 55 (03) : 2343 - 2353
[7] Adaptive Learning: A New Decentralized Reinforcement Learning Approach for Cooperative Multiagent Systems
Li, Meng-Lin
Chen, Shaofei
Chen, Jing
IEEE ACCESS, 2020, 8 : 99404 - 99421
[8] Opportunities for multiagent systems and multiagent reinforcement learning in traffic control
Bazzan, Ana L. C.
AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2009, 18 (03) : 342 - 375
[9] Opportunities for multiagent systems and multiagent reinforcement learning in traffic control
Ana L. C. Bazzan
Autonomous Agents and Multi-Agent Systems, 2009, 18 : 342 - 375
[10] Constrained Multiagent Reinforcement Learning for Large Agent Population
Ling, Jiajing
Singh, Arambam James
Thien, Nguyen Duc
Kumar, Akshat
MACHINE LEARNING AND KNOWLEDGE DISCOVERY IN DATABASES, ECML PKDD 2022, PT IV, 2023, 13716 : 183 - 199
