A REINFORCEMENT LEARNING APPROACH FOR MULTIAGENT NAVIGATION

Cited by: 0

Authors:

Martinez-Gil, Francisco [1]

Barber, Fernando [1]

Lozano, Miguel [1]

Grimaldo, Francisco [1]

Fernandez, Fernando [1]

Affiliations:

[1] Univ Valencia, Dept Informat, Campus Burjassot, Valencia, Spain

Source:

ICAART 2010: PROCEEDINGS OF THE 2ND INTERNATIONAL CONFERENCE ON AGENTS AND ARTIFICIAL INTELLIGENCE, VOL 1: ARTIFICIAL INTELLIGENCE | 2010

Keywords:

Reinforcement learning; Multiagent systems; Local navigation

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

This paper presents a Q-Learning-based multiagent system that provides navigation skills to simulation agents in virtual environments. We focus on learning local navigation behaviours from interactions with other agents and the environment. We adopt an environment-independent state space representation to provide the scalability that such systems require. We then evaluate whether the learned action-value functions can be transferred to other agents to increase the size of the group without losing behavioural quality. We describe the learning process and the collective behaviours obtained in a well-known multiagent navigation experiment: the exit of a place through a door.
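
To make the abstract's terms concrete, the following is a minimal sketch of tabular Q-learning with a simple form of value-function transfer. It is an illustration under stated assumptions, not the authors' implementation: the action names, the state labels, the learning rate `alpha`, the discount `gamma`, and the helper names (`make_q_table`, `q_update`, `epsilon_greedy`, `transfer`) are all hypothetical, since the abstract does not specify them. States are assumed to be hashable descriptions of an agent's local situation.

```python
from collections import defaultdict
import random

# Hypothetical action set; the abstract does not list the agents' actual actions.
ACTIONS = ["forward", "turn_left", "turn_right", "stop"]


def make_q_table():
    """Map each (state, action) pair to a value, defaulting to 0.0."""
    return defaultdict(float)


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one standard Q-learning update (alpha and gamma are assumed values)."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


def epsilon_greedy(q, state, epsilon=0.1, rng=random):
    """Pick a random action with probability epsilon, otherwise a greedy one."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def transfer(q):
    """Give a new agent its own copy of a learned table."""
    return defaultdict(float, q)
```

Because the table is keyed only by the agent's local state, and not by absolute positions in a particular scene, a copy made with `transfer` can in principle be handed to additional agents, which is the scaling question the abstract raises.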


Pages: 607-610

Page count: 4

