A Study on Cooperative Action Selection Considering Unfairness in Decentralized Multiagent Reinforcement Learning

Times cited: 0

Authors:

Matsui, Toshihiro [1]

Matsuo, Hiroshi [1]

Affiliations:

[1] Nagoya Inst Technol, Showa Ku, Gokisyo Cho, Nagoya, Aichi 4668555, Japan

Source:

ICAART: PROCEEDINGS OF THE 9TH INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 1 | 2017

Keywords:

Multiagent System; Reinforcement Learning; Distributed Constraint Optimization; Unfairness; Leximin;

DOI:

10.5220/0006203800880095

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Reinforcement learning has been studied as a cooperative learning and optimization method in multiagent systems. In several frameworks of multiagent reinforcement learning, the whole problem of the system is decomposed into local problems for individual agents. To choose an appropriate cooperative action, the agents run an optimization method that can be executed in a distributed manner. The conventional goal of learning is to maximize the total reward among agents, but in practical resource allocation problems, unfairness among agents is critical. Several recent studies of decentralized optimization methods have considered unfairness as a criterion. We address an action selection method for decentralized reinforcement learning that is based on the leximin criterion, which reduces unfairness among agents. We experimentally evaluated the effects and influences of the proposed approach on classes of sensor network problems.

Pages: 88-95

Page count: 8

