Decentralized Multi-Agent Reinforcement Learning in Average-Reward Dynamic DCOPs

Cited by: 0
Authors
Nguyen, Duc Thien [1]
Yeoh, William [2]
Lau, Hoong Chuin [1]
Zilberstein, Shlomo [3]
Zhang, Chongjie [4]
Affiliations
[1] Singapore Management Univ, Sch Informat Syst, Singapore, Singapore
[2] New Mexico State Univ, Dept Comp Sci, Las Cruces, NM 88003 USA
[3] Univ Massachusetts, Sch Comp Sci, Amherst, MA 01003 USA
[4] MIT, Comp Sci & Artificial Intelligence Lab, Cambridge, MA 02139 USA
Source
PROCEEDINGS OF THE TWENTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE | 2014
Funding
National Research Foundation, Singapore; US National Science Foundation
Keywords
ALGORITHMS;
DOI
Not available
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Researchers have introduced the Dynamic Distributed Constraint Optimization Problem (Dynamic DCOP) formulation to model dynamically changing multi-agent coordination problems, where a dynamic DCOP is a sequence of (static canonical) DCOPs, each partially different from the DCOP preceding it. Existing work typically assumes that the problem in each time step is decoupled from the problems in other time steps, which might not hold in some applications. Therefore, in this paper, we make the following contributions: (i) We introduce a new model, called Markovian Dynamic DCOPs (MD-DCOPs), where the DCOP in the next time step is a function of the value assignments in the current time step; (ii) We introduce two distributed reinforcement learning algorithms, the Distributed RVI Q-learning algorithm and the Distributed R-learning algorithm, that balance exploration and exploitation to solve MD-DCOPs in an online manner; and (iii) We empirically evaluate them against an existing multi-armed bandit DCOP algorithm on dynamic DCOPs.
Pages: 1447-1455
Page count: 9
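
Illustrative sketch (not from the paper): the abstract names R-learning as the basis of one of the two distributed algorithms. The block below shows one common, simplified variant of the classic single-agent, tabular R-learning update for average-reward problems. It is not the paper's Distributed R-learning algorithm. The two-state toy problem, the function names (r_learning_step, toy_step, run_demo), and all step sizes are assumptions made purely for illustration. The state stands in for the current problem, and the action stands in for a value assignment that determines the next problem, as in the MD-DCOP description.

```python
import random


def r_learning_step(Q, rho, s, a, r, s_next, alpha=0.1, beta=0.01):
    """Apply one tabular R-learning update.

    Q is a list of per-state action-value lists and is mutated in place.
    rho is the running estimate of the average reward; the new value is returned.
    """
    # Check greediness before Q changes, so rho is only adapted on greedy steps.
    was_greedy = Q[s][a] == max(Q[s])
    # Average-reward temporal-difference error (no discount factor).
    delta = r - rho + max(Q[s_next]) - Q[s][a]
    Q[s][a] += alpha * delta
    if was_greedy:
        rho += beta * delta
    return rho


def toy_step(s, a):
    """Hypothetical 2-state problem: action 0 stays put, action 1 switches state.

    Staying pays 1.0 in state 0 and 2.0 in state 1; switching pays 0.0.
    """
    s_next = s if a == 0 else 1 - s
    reward = (1.0, 2.0)[s] if a == 0 else 0.0
    return reward, s_next


def run_demo(steps=20000, epsilon=0.1, seed=0):
    """Run epsilon-greedy R-learning on the toy problem; return (rho, greedy policy)."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0], [0.0, 0.0]]
    rho, s = 0.0, 0
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(2)  # explore
        else:
            a = max(range(2), key=lambda x: Q[s][x])  # exploit
        r, s_next = toy_step(s, a)
        rho = r_learning_step(Q, rho, s, a, r, s_next)
        s = s_next
    policy = [max(range(2), key=lambda x: Q[st][x]) for st in range(2)]
    return rho, policy
```

Calling run_demo() returns the learned average-reward estimate and the greedy action per state. In a distributed setting the table Q would presumably be factored across agents rather than stored centrally; that decomposition is described in the paper, not in this sketch.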