Reinforcement distribution in a team of cooperative Q-learning agents

Cited by: 4
Authors
Abbasi, Zahra [1]
Abbasi, Mohammad Ali [2]
Affiliations
[1] Islamic Azad Univ, Parand Branch, Tehran, Iran
[2] Univ Tehran, Fac Engn, Dept Elect & Comp Engn, Tehran 14174, Iran
Keywords
agent learning, evolution, and adaptation; multiagent systems; cooperative distributed problem solving; coordination, cooperation, and teamwork; multiagent learning
DOI
10.1109/SNPD.2008.154
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
In a team of Q-learning agents, the agents cooperate with each other while learning to perform their assigned task, with the aim of increasing team performance. If the role of each agent is clearly specified, which is a very hard task for a supervisor agent, the team will learn more efficiently, because in this case each agent is reinforced according to its real effect on the team performance. Assuming an identical role for all agents is the technique most commonly used by current researchers to avoid the modeling complexities, but we believe this is not the optimal method for reinforcement distribution. The main goal of this research is to find an indirect evaluation method that evaluates the role of each agent in the team and distributes the reinforcement signal accordingly. The expertness of each agent is used as a criterion to estimate the effect of each agent's action on the team performance. Random and equal reinforcement signal distribution methods are also used as baselines against which expertness-based reinforcement sharing is evaluated. In addition, a new test bed, called EPIDEM, is developed to evaluate the proposed methods. The results show that distributing the reinforcement signals with the proposed method improves the team learning speed.
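The three distribution schemes named in the abstract (equal, random, and expertness-based) can be illustrated with a minimal Python sketch. The abstract does not say how expertness is measured or how the shares are computed, so everything here is an assumption for illustration only: expertness is taken to be a non-negative score per agent, the team reinforcement is split in proportion to it, and the function names are hypothetical rather than the authors' implementation.

```python
import random


def distribute_equal(team_reward, n_agents):
    """Give every agent the same share of the team reinforcement."""
    return [team_reward / n_agents] * n_agents


def distribute_random(team_reward, n_agents, rng):
    """Split the team reinforcement using random, normalized weights."""
    weights = [rng.random() for _ in range(n_agents)]
    total = sum(weights)
    if total == 0:
        # Degenerate draw: fall back to an equal split.
        return distribute_equal(team_reward, n_agents)
    return [team_reward * w / total for w in weights]


def distribute_by_expertness(team_reward, expertness):
    """Split the team reinforcement in proportion to each agent's expertness.

    Assumption: `expertness` holds non-negative scores; proportional
    sharing is one plausible rule, not necessarily the paper's rule.
    """
    total = sum(expertness)
    if total <= 0:
        # No agent has any measured expertness yet: share equally.
        return distribute_equal(team_reward, len(expertness))
    return [team_reward * e / total for e in expertness]


if __name__ == "__main__":
    rng = random.Random(0)
    print(distribute_equal(9.0, 3))
    print(distribute_random(6.0, 3, rng))
    print(distribute_by_expertness(10.0, [1.0, 3.0]))
```

Each agent would then apply its own share as the reward term in its usual Q-learning update. The fallback to an equal split when no expertness has accumulated is a design choice of this sketch, made to avoid division by zero early in learning.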
Pages: 154 / +
Page count: 3