Learning of Behavior Trees for Autonomous Agents

被引:55
作者
Colledanchise, Michele [1 ]
Parasuraman, Ramviyas [2 ]
Ogren, Petter [3 ]
机构
[1] Ist Italiano Tecnol, iCub Facil, I-16163 Genoa, Italy
[2] Purdue Univ, SMART Lab, W Lafayette, IN 47907 USA
[3] Royal Inst Technol KTH, S-11428 Stockholm, Sweden
关键词
Manuscript received April 10; 2017; revised September 11; accepted February 2; 2018. Date of publication March 19; 2018; date of current version June 14; 2019. This work was supported by the Swedish Foundation for Strategic Research through the Swedish Maritime Robotics Center. (Corresponding author: Michele Colledanchise.) M. Colledanchise is with the iCub Facility; Istituto Italiano di Tecnologia; 16163; Genova; Italy; (e-mail:; michele.colledanchise@iit.it). R. Parasuraman is with the SMART Lab; Purdue University; West Lafayette; IN; 47907; USA; ramviyas@purdue.edu). P. Ögren is with the Royal Institute of Technology (KTH); Stockholm; 11428; Sweden; petter@kth.se). Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TG.2018.2816806 Fig. 1. Mario AI benchmark. The agent can observe only a small subset of the environment; namely the receptive field. (a) simple Scenario. (b) Complex scenario;
D O I
10.1109/TG.2018.2816806
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
In this paper, we study the problem of automatically synthesizing a successful behavior tree (BT) in an a priori unknown dynamic environment. Starting with a given set of actions, a reward function, and sensing in terms of a set of binary conditions, the proposed algorithm incrementally learns a switching structure, in terms of a BT, that is able to handle the situation encountered. Exploiting the fact that BTs generalize AND-OR -trees and also provide very natural chromosome mappings for genetic programming. we combine the long-term performance of genetic programming with a greedy element and use the AND-OR analogy to limit the size of the resulting structure. Finally, earlier results on BTs enable us to provide certain safety guarantees for the resulting system. Using the testing environment Mario AI, we compare our approach to alternative methods for learning BTs and finite state machines. The evaluation shows that the proposed approach generated solutions with better performance, and often fewer nodes than the other two methods.
引用
收藏
页码:183 / 189
页数:7
相关论文
共 25 条
[21]  
Oates T, 1996, PROCEEDINGS OF THE THIRTEENTH NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND THE EIGHTH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE, VOLS 1 AND 2, P863
[22]  
Ogren P., 2012, AIAA GUIDANCE NAVIGA, P4458
[23]  
Perez D, 2011, LECT NOTES COMPUT SC, V6624, P123, DOI 10.1007/978-3-642-20525-5_13
[24]   CONVERGENCE ANALYSIS OF CANONICAL GENETIC ALGORITHMS [J].
RUDOLPH, G .
IEEE TRANSACTIONS ON NEURAL NETWORKS, 1994, 5 (01) :96-101
[25]   Behavior Trees for Evolutionary Robotics [J].
Scheper, Kirk Y. W. ;
Tijmons, Sjoerd ;
de Visser, Cornelis C. ;
de Croon, Guido C. H. E. .
ARTIFICIAL LIFE, 2016, 22 (01) :23-48