Simulation of multi-robot reinforcement learning for box-pushing problem

Cited by: 17
Authors
Kovac, K [1]
Zivkovic, I [1]
Basic, BD [1]
Affiliations
[1] Univ Zagreb, Fac Elect & Comp Engn, Dept Elect Microelect Comp & Intelligent Syst, Zagreb 41000, Croatia
Source
MELECON 2004: PROCEEDINGS OF THE 12TH IEEE MEDITERRANEAN ELECTROTECHNICAL CONFERENCE, VOLS 1-3 | 2004
Keywords
DOI
10.1109/MELCON.2004.1347002
Chinese Library Classification (CLC) number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
The box-pushing problem represents a challenging domain for the study of object manipulation in a multi-robot environment. Our box-pushing problem is based on the pusher-watcher approach, involving two pusher robots that learn the best strategy for cooperatively moving an oversized elongated box to a specified goal and one watcher robot acting as the environment. This paper presents a solution to the box-pushing problem based on reinforcement learning in a multi-agent system. Within the framework of the paper, a simulator has been developed to carry out practical tests.
Pages: 603-606
Page count: 4