Optimal tracking agent: a new framework of reinforcement learning for multiagent systems

Cited by: 3
Authors
Cao, Weihua [1]
Chen, Gang [1]
Chen, Xin [1]
Wu, Min [1]
Affiliations
[1] Cent South Univ, Inst Adv Control & Intelligent Automat, Sch Informat Sci & Engn, Changsha 410083, Hunan, Peoples R China
Funding
Specialized Research Fund for the Doctoral Program of Higher Education;
Keywords
estimator; action selection mechanism; curse of dimensionality; optimal tracking agent; multiagent systems;
DOI
10.1002/cpe.2870
Chinese Library Classification number
TP31 [Computer software];
Discipline classification code
081202; 0835;
Abstract
SUMMARY: The curse of dimensionality is a ubiquitous problem in multiagent reinforcement learning: the learning and storage space grows exponentially with the number of agents, which hinders the application of multiagent reinforcement learning. To relieve this problem, we propose a new framework named the optimal tracking agent (OTA). The OTA views the other agents as part of the environment and uses a reduced form to learn the optimal decision. Although merging the other agents into the environment may reduce the dimension of the action space, the environment characterized in this form is dynamic and does not satisfy the convergence conditions of reinforcement learning (RL). Thus, we develop an estimator to track the dynamics of the environment. The estimator obtains the dynamic model, and model-based RL can then be used to react optimally to the dynamic environment. In traditional RL, learning is a stationary process, and the usual action selection mechanisms suit only such a stationary process. In the OTA, by contrast, the Q-function is itself a dynamic process because of the other agents' dynamics, so we improve the greedy action selection mechanism to adapt to these dynamics. As a result, the OTA converges. An experiment illustrates the validity and efficiency of the OTA. Copyright (c) 2012 John Wiley & Sons, Ltd.
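The exponential growth mentioned in the abstract can be illustrated with a small counting sketch. It is not taken from the paper: it assumes a tabular Q-function, a shared state space of `n_states` states, and `n_actions` actions per agent, and the function names are hypothetical. It compares a table indexed by the joint action of all agents with a reduced table indexed only by the agent's own action.

```python
# Illustrative only: assumes tabular Q-functions and identical action sets.
# The function names and the numbers used below are hypothetical.

def joint_q_table_size(n_states: int, n_actions: int, n_agents: int) -> int:
    """Entries in a Q-table indexed by state and the joint action of all agents."""
    return n_states * n_actions ** n_agents


def reduced_q_table_size(n_states: int, n_actions: int) -> int:
    """Entries when an agent indexes only its own action,
    treating the other agents as part of the environment."""
    return n_states * n_actions


if __name__ == "__main__":
    for n_agents in (1, 2, 4, 8):
        joint = joint_q_table_size(100, 5, n_agents)
        reduced = reduced_q_table_size(100, 5)
        print(f"agents={n_agents}  joint={joint}  reduced={reduced}")
```

The reduced table stays the same size as agents are added, which is the saving the abstract describes; the cost, as the abstract notes, is that the environment seen by each agent is no longer stationary.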
Pages: 2002-2015
Number of pages: 14