Rapid behavior learning in multi-agent environment based on state value estimation of others

Cited by: 0
Authors
Takahashi, Yasutake [1 ]
Noma, Kentaro [1 ]
Asada, Minoru [1 ]
Affiliations
[1] Osaka Univ, Dept Adapt Machine Syst, Suita, Osaka 5650871, Japan
Source
2007 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS, VOLS 1-9 | 2007
DOI
10.1109/IROS.2007.4399294
Chinese Library Classification: TP [automation technology; computer technology]
Subject Classification Code: 0812
Abstract
Existing reinforcement learning approaches suffer from the curse of dimensionality when applied to dynamic multiagent environments. RoboCup competitions are a typical example, since the other agents and their behaviors easily cause state and action space explosion. This paper presents a method of modular learning in a multiagent environment by which the learning agent can acquire behaviors that are cooperative with its teammates and competitive against its opponents. The key ideas are as follows. First, a two-layer hierarchical system with multiple learning modules is adopted to reduce the size of the sensor and action spaces: the state space of the top layer consists of the state values estimated by the lower-level modules, and macro actions are used to reduce the size of the physical action space. Second, how close each other agent is to its own goal is estimated by observation and used as a state variable in the top-layer state space to realize the cooperative/competitive behaviors. The method is applied to a 4-on-5 (defense team vs. offense team) game task, and the learning agent successfully acquired teamwork plays (pass and shoot) in a much shorter learning time (30 times faster than earlier work).
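The two-layer architecture the abstract describes can be sketched in a few lines. This is a minimal illustrative sketch, not the authors' implementation: the class names (`LowerModule`, `TopLayer`), the assumption that value estimates lie in [0, 1], the binning scheme, and the use of TD(0) plus epsilon-greedy Q-learning are all simplifying assumptions chosen to show how lower-module state values can form the top layer's (much smaller) state space.

```python
import random


class LowerModule:
    """Lower-level behavior module that learns a state-value function V(s) by TD(0)."""

    def __init__(self, n_states, alpha=0.1, gamma=0.9):
        self.v = [0.0] * n_states
        self.alpha = alpha
        self.gamma = gamma

    def update(self, s, reward, s_next):
        # Standard TD(0) update: V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))
        self.v[s] += self.alpha * (reward + self.gamma * self.v[s_next] - self.v[s])


class TopLayer:
    """Top layer: its state is a tuple of discretized lower-module state values,
    and its actions are macro actions (e.g. pass, shoot, approach the ball)."""

    def __init__(self, n_macro_actions, bins=4, alpha=0.1, gamma=0.9, eps=0.1):
        self.q = {}  # (abstract_state, macro_action) -> Q-value
        self.n_actions = n_macro_actions
        self.bins = bins
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def abstract_state(self, module_values):
        # Compress the modules' value estimates (assumed to lie in [0, 1]) into a
        # small discrete tuple; this is what shrinks the top-layer state space.
        return tuple(min(int(v * self.bins), self.bins - 1) for v in module_values)

    def select(self, state):
        # Epsilon-greedy choice over macro actions.
        if random.random() < self.eps:
            return random.randrange(self.n_actions)
        qs = [self.q.get((state, a), 0.0) for a in range(self.n_actions)]
        return qs.index(max(qs))

    def update(self, state, action, reward, next_state):
        # One-step Q-learning over the abstract state space.
        best_next = max(self.q.get((next_state, a), 0.0) for a in range(self.n_actions))
        old = self.q.get((state, action), 0.0)
        self.q[(state, action)] = old + self.alpha * (reward + self.gamma * best_next - old)


# Usage: two observed agents, each tracked by one value-estimating module.
random.seed(0)
modules = [LowerModule(n_states=5) for _ in range(2)]
modules[0].update(0, 1.0, 1)            # a rewarded transition raises V(0) to 0.1
top = TopLayer(n_macro_actions=3)
s = top.abstract_state([m.v[0] for m in modules])
a = top.select(s)
top.update(s, a, 1.0, s)
```

The point of the sketch is the data flow: the top layer never sees raw sensor readings, only each module's scalar value estimate (here, how close an observed agent is judged to be to its goal), so adding an agent adds one scalar rather than a whole sensor subspace.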
Pages: 76-81 (6 pages)
Related Papers (50 total)
  • [41] Decentralized Exploration of a Structured Environment Based on Multi-agent Deep Reinforcement Learning
    He, Dingjie
    Feng, Dawei
    Jia, Hongda
    Liu, Hui
    2020 IEEE 26TH INTERNATIONAL CONFERENCE ON PARALLEL AND DISTRIBUTED SYSTEMS (ICPADS), 2020, : 172 - 179
  • [42] CONSENSUS-BASED STATE ESTIMATION FOR MULTI-AGENT SYSTEMS WITH CONSTRAINT INFORMATION
    Hu, Chen
    Qin, Weiwei
    Li, Zhenhua
    He, Bing
    Liu, Gang
    KYBERNETIKA, 2017, 53 (03) : 545 - 561
  • [43] Pricing Cloud Resource based on Multi-Agent Reinforcement Learning in the Competing Environment
    Shi, Bing
    Yuan, Han
    Shi, Rongjian
    2018 IEEE INT CONF ON PARALLEL & DISTRIBUTED PROCESSING WITH APPLICATIONS, UBIQUITOUS COMPUTING & COMMUNICATIONS, BIG DATA & CLOUD COMPUTING, SOCIAL COMPUTING & NETWORKING, SUSTAINABLE COMPUTING & COMMUNICATIONS, 2018, : 462 - 468
  • [44] Multi-Agent Reinforcement Learning Approach Based on Reduced Value Function Approximations
    Abouheaf, Mohammed
    Gueaieb, Wail
    2017 IEEE 5TH INTERNATIONAL SYMPOSIUM ON ROBOTICS AND INTELLIGENT SENSORS (IRIS), 2017, : 111 - 116
  • [45] Augmented Sensing-Based State Estimation for Cooperative Multi-Agent Systems
    Kwon, Cheolhyeon
    Kun, David
    Hwang, Inseok
    2015 AMERICAN CONTROL CONFERENCE (ACC), 2015, : 3792 - 3797
  • [46] Greedy based Value Representation for Optimal Coordination in Multi-agent Reinforcement Learning
    Wan, Lipeng
    Liu, Zeyang
    Chen, Xingyu
    Lan, Xuguang
    Zheng, Nanning
    INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 162, 2022
  • [47] Ubiquitous Learning Environment: Smart Learning Platform with Multi-Agent Architecture
    Punnarumol Temdee
    Wireless Personal Communications, 2014, 76 : 627 - 641
  • [49] ADAPTIVE STATE REPRESENTATIONS FOR MULTI-AGENT REINFORCEMENT LEARNING
    De Hauwere, Yann-Michael
    Vrancx, Peter
    Nowe, Ann
    ICAART 2011: PROCEEDINGS OF THE 3RD INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 2, 2011, : 181 - 189
  • [50] Multi-agent learning
    Eduardo Alonso
    Autonomous Agents and Multi-Agent Systems, 2007, 15 : 3 - 4