Dynamic control in real-time heuristic search

被引:17
作者
Bulitko, Vadim [1 ]
Lustrek, Mitja [2 ]
Schaeffer, Jonathan [1 ]
Bjornsson, Yngvi [3 ]
Sigmundarson, Sverrir [4 ]
机构
[1] Univ Alberta, Dept Comp Sci, Edmonton, AB T6G 2E8, Canada
[2] Jozef Stefan Inst, Dept Intelligent Syst, Ljubljana 1000, Slovenia
[3] Reykjavik Univ, Sch Comp Sci, IS-103 Reykjavik, Iceland
[4] Landsbanki London Branch, London EC3A 7QR, England
关键词
Heuristic algorithms - Learning algorithms - Modular robots;
D O I
10.1613/jair.2497
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Real-time heuristic search is a challenging type of agent-centered search because the agent's planning time per action is bounded by a constant independent of problem size. A common problem that imposes such restrictions is pathfinding in modern computer games where a large number of units must plan their paths simultaneously over large maps. Common search algorithms ( e. g., A*, IDA*, D*, ARA*, AD*) are inherently not real-time and may lose completeness when a constant bound is imposed on per-action planning time. Real-time search algorithms retain completeness but frequently produce unacceptably suboptimal solutions. In this paper, we extend classic and modern real-time search algorithms with an automated mechanism for dynamic depth and subgoal selection. The new algorithms remain real-time and complete. On large computer game maps, they find paths within 7% of optimal while on average expanding roughly a single state per action. This is nearly a three-fold improvement in suboptimality over the existing state-of-the-art algorithms and, at the same time, a 15-fold improvement in the amount of planning per action.
引用
收藏
页码:419 / 452
页数:34
相关论文
共 55 条
[41]   PRIORITIZED SWEEPING - REINFORCEMENT LEARNING WITH LESS DATA AND LESS TIME [J].
MOORE, AW ;
ATKESON, CG .
MACHINE LEARNING, 1993, 13 (01) :103-130
[42]  
NASH A, 2007, P AAAI C ART INT, P1177
[43]  
Pearl J., 1984, HEURISTICS
[44]  
Rayner DC, 2007, 20TH INTERNATIONAL JOINT CONFERENCE ON ARTIFICIAL INTELLIGENCE, P2372
[45]  
Russell Stuart J., 1991, Do the Right Thing: Studies in Limited Rationality
[47]  
Schaid DJ, 2000, GENET EPIDEMIOL, V19, P30, DOI 10.1002/1098-2272(200007)19:1<30::AID-GEPI3>3.0.CO
[48]  
2-X
[49]   Controlling the learning process of real-time heuristic search [J].
Shimbo, M ;
Ishida, T .
ARTIFICIAL INTELLIGENCE, 2003, 146 (01) :1-41
[50]  
SIGMUNDARSON S, 2006, P NAT C ART INT AAAI, P136