Rapid behavior learning in multi-agent environment based on state value estimation of others

Cited by: 0
Authors
Takahashi, Yasutake [1 ]
Noma, Kentaro [1 ]
Asada, Minoru [1 ]
Affiliations
[1] Osaka Univ, Dept Adapt Machine Syst, Suita, Osaka 5650871, Japan
Source
2007 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-9 | 2007
Keywords
DOI
10.1109/IROS.2007.4399294
CLC classification
TP [Automation Technology, Computer Technology]
Discipline code
0812
Abstract
Existing reinforcement learning approaches suffer from the curse of dimensionality when applied to dynamic multiagent environments. A typical example is RoboCup competition, where other agents and their behaviors easily cause state and action space explosion. This paper presents a method of modular learning in a multiagent environment by which the learning agent can acquire behaviors that are cooperative with its teammates and competitive against its opponents. The key ideas are as follows. First, a two-layer hierarchical system with multiple learning modules is adopted to reduce the size of the sensor and action spaces: the state space of the top layer consists of the state values from the lower level, and macro actions are used to reduce the size of the physical action space. Second, the extent to which another agent is close to its own goal is estimated by observation and used as a state value in the top-layer state space, realizing the cooperative/competitive behaviors. The method is applied to a 4 (defense team) on 5 (offense team) game task, and the learning agent successfully acquired teamwork plays (pass and shoot) in a much shorter learning time (30 times quicker than the earlier work).
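The two-layer architecture described in the abstract can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the class names (`BehaviorModule`, `TopLayerLearner`), the TD(0)/Q-learning choices, and the discretization scheme are all assumptions. Lower-level modules each learn a state value function for one basic behavior; the top layer treats the discretized vector of those values (which could include values estimated for observed teammates and opponents) as its own, much smaller state, and learns which macro action (i.e., which module) to execute.

```python
import random


class BehaviorModule:
    """Lower-level module: learns a state value V(s) for one basic
    behavior (e.g. 'approach ball') via tabular TD(0) updates."""

    def __init__(self, n_states, alpha=0.1, gamma=0.9):
        self.v = [0.0] * n_states
        self.alpha, self.gamma = alpha, gamma

    def update(self, s, reward, s_next):
        # Standard TD(0) value update.
        td_error = reward + self.gamma * self.v[s_next] - self.v[s]
        self.v[s] += self.alpha * td_error


class TopLayerLearner:
    """Top layer: its state is the vector of discretized module state
    values, and its actions are macro actions (which module to run)."""

    def __init__(self, modules, n_macro_actions, n_bins=3,
                 alpha=0.1, gamma=0.9):
        self.modules = modules
        self.n_macro_actions = n_macro_actions
        self.n_bins = n_bins
        self.alpha, self.gamma = alpha, gamma
        self.q = {}  # (abstract_state, macro_action) -> Q value

    def abstract_state(self, s):
        # Discretize each module's value estimate into a few bins,
        # shrinking the top-layer state space.
        return tuple(min(int(m.v[s] * self.n_bins), self.n_bins - 1)
                     for m in self.modules)

    def choose(self, s, epsilon=0.1):
        # Epsilon-greedy macro-action selection over the abstract state.
        a_s = self.abstract_state(s)
        if random.random() < epsilon:
            return random.randrange(self.n_macro_actions)
        return max(range(self.n_macro_actions),
                   key=lambda a: self.q.get((a_s, a), 0.0))

    def update(self, s, a, reward, s_next):
        # One-step Q-learning over abstract states and macro actions.
        a_s, a_s_next = self.abstract_state(s), self.abstract_state(s_next)
        best_next = max(self.q.get((a_s_next, b), 0.0)
                        for b in range(self.n_macro_actions))
        old = self.q.get((a_s, a), 0.0)
        self.q[(a_s, a)] = old + self.alpha * (reward
                                               + self.gamma * best_next - old)
```

The point of the sketch is the size reduction: the top layer never sees raw sensor data, only a short tuple of binned value estimates, so adding an observed agent adds one dimension to that tuple rather than multiplying the raw state space.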
Pages: 76-81
Page count: 6
Related papers
50 items in total
  • [1] Efficient Behavior Learning Based on State Value Estimation of Self and Others
    Takahashi, Yasutake
    Noma, Kentaro
    Asada, Minoru
    ADVANCED ROBOTICS, 2008, 22 (12) : 1379 - 1395
  • [2] Behavior modeling based on multi-agent and multi-agent simulation environment
    Yin, QJ
    Du, XY
    Huang, K
    SYSTEM SIMULATION AND SCIENTIFIC COMPUTING, VOLS 1 AND 2, PROCEEDINGS, 2005, : 1531 - 1536
  • [3] Better value estimation in Q-learning-based multi-agent reinforcement learning
    Ding, Ling
    Du, Wei
    Zhang, Jian
    Guo, Lili
    Zhang, Chenglong
    Jin, Di
    Ding, Shifei
    SOFT COMPUTING, 2024, 28 (06) : 5625 - 5638
  • [5] Behavior acquisition based on multi-module learning system in multi-agent environment
    Takahashi, Y
    Edazawa, K
    Asada, M
    ROBOCUP 2002: ROBOT SOCCER WORLD CUP VI, 2003, 2752 : 435 - 442
  • [6] Agent programmability in a multi-agent learning environment
    Cao, Y
    Greer, J
    ARTIFICIAL INTELLIGENCE IN EDUCATION: SHAPING THE FUTURE OF LEARNING THROUGH INTELLIGENT TECHNOLOGIES, 2003, 97 : 297 - 304
  • [7] Learning to Optimize State Estimation in Multi-Agent Reinforcement Learning-Based Collaborative Detection
    Zhou, Tianlong
    Shi, Tianyi
    Gao, Hongye
    Rao, Weixiong
    IEEE TRANSACTIONS ON MOBILE COMPUTING, 2024, 23 (12) : 14330 - 14343
  • [8] Multi-module learning system for behavior acquisition in multi-agent environment
    Takahashi, Y
    Edazawa, K
    Asada, M
    2002 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-3, PROCEEDINGS, 2002, : 927 - 931
  • [9] Correcting biased value estimation in mixing value-based multi-agent reinforcement learning by multiple choice learning
    Liu, Bing
    Xie, Yuxuan
    Feng, Lei
    Fu, Ping
    ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE, 2022, 116
  • [10] Modular learning system and scheduling for behavior acquisition in multi-agent environment
    Takahashi, Y
    Edazawa, K
    Asada, M
    ROBOCUP 2004: ROBOT SOCCER WORLD CUP VIII, 2005, 3276 : 548 - 555