A Study on Cooperative Action Selection Considering Unfairness in Decentralized Multiagent Reinforcement Learning

Cited by: 0
Authors
Matsui, Toshihiro [1]
Matsuo, Hiroshi [1]
Affiliations
[1] Nagoya Inst Technol, Showa Ku, Gokisyo Cho, Nagoya, Aichi 4668555, Japan
Source
ICAART: PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 1 | 2017
Keywords
Multiagent System; Reinforcement Learning; Distributed Constraint Optimization; Unfairness; Leximin
DOI
10.5220/0006203800880095
Chinese Library Classification number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Reinforcement learning has been studied as a cooperative learning and optimization method in multiagent systems. In several frameworks of multiagent reinforcement learning, the whole problem of the system is decomposed into local problems for the agents. To choose an appropriate cooperative action, the agents run an optimization method that can be executed in a distributed manner. While the conventional goal of learning is to maximize the total reward among agents, unfairness among agents is critical in practical resource allocation problems. Several recent studies of decentralized optimization methods have treated unfairness as a criterion. We address an action selection method for decentralized reinforcement learning based on the leximin criterion, which reduces unfairness among agents. We experimentally evaluated the effects and influences of the proposed approach on classes of sensor network problems.
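To illustrate the contrast the abstract draws between maximizing total reward and a leximin criterion, the following is a minimal sketch, not the authors' method. It enumerates joint actions centrally, whereas the paper concerns a distributed optimization. The reward table, the action labels, and the function names are all hypothetical. Leximin compares reward vectors after sorting them in ascending order, so the worst-off agent is improved first.

```python
def leximin_key(rewards):
    """Sort rewards in ascending order; comparing these tuples
    lexicographically gives the leximin ordering."""
    return tuple(sorted(rewards))


def select_leximin(joint_actions, reward_fn):
    """Pick the joint action whose sorted reward vector is leximin-maximal."""
    return max(joint_actions, key=lambda a: leximin_key(reward_fn(a)))


def select_utilitarian(joint_actions, reward_fn):
    """Pick the joint action that maximizes the sum of rewards."""
    return max(joint_actions, key=lambda a: sum(reward_fn(a)))


# Hypothetical two-agent reward table: joint action -> (reward_1, reward_2).
REWARDS = {
    ("a", "a"): (9, 1),
    ("a", "b"): (4, 4),
    ("b", "a"): (3, 5),
    ("b", "b"): (2, 2),
}

joint_actions = list(REWARDS)
best_sum = select_utilitarian(joint_actions, REWARDS.get)
best_leximin = select_leximin(joint_actions, REWARDS.get)
print("utilitarian:", best_sum, REWARDS[best_sum])
print("leximin:    ", best_leximin, REWARDS[best_leximin])
```

In this toy table the sum criterion prefers the lopsided outcome (9, 1), while leximin prefers (4, 4), because its smallest reward is the largest among all options.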
Pages: 88-95
Number of pages: 8