Planning using hierarchical constrained Markov decision processes

被引:7
作者
Feyzabadi, Seyedshams [1 ]
Carpin, Stefano [1 ]
机构
[1] Univ Calif, Sch Engn, 5200 North Lake Rd, Merced, CA 95343 USA
关键词
Constrained Markov decision processes; Planning; Uncertainty;
D O I
10.1007/s10514-017-9630-4
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Constrained Markov decision processes offer a principled method to determine policies for sequential stochastic decision problems where multiple costs are concurrently considered. Although they could be very valuable in numerous robotic applications, to date their use has been quite limited. Among the reasons for their limited adoption is their computational complexity, since policy computation requires the solution of constrained linear programs with an extremely large number of variables. To overcome this limitation, we propose a hierarchical method to solve large problem instances. States are clustered into macro states and the parameters defining the dynamic behavior and the costs of the clustered model are determined using a Monte Carlo approach. We show that the algorithm we propose to create clustered states maintains valuable properties of the original model, like the existence of a solution for the problem. Our algorithm is validated in various planning problems in simulation and on a mobile robot platform, and we experimentally show that the clustered approach significantly outperforms the non-hierarchical solution while experiencing only moderate losses in terms of objective functions.
引用
收藏
页码:1589 / 1607
页数:19
相关论文
共 30 条
[1]  
[Anonymous], 2006, Intelligent robotics and autonomous agents
[2]  
[Anonymous], 2006, Planning algorithms
[3]  
[Anonymous], 1999, STOCH MODEL SER, DOI 10.1201/9781315140223
[4]  
[Anonymous], 2016, DYNAMIC PROGRAMMING
[5]  
Bai A., 2012, Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems, V3, P1215
[6]  
Barry J., 2010, TECHNICAL REPORT
[7]  
Barry J. L., 2011, INT JOINT C ART INT
[8]  
Bertsekas D., 2005, DYNAMIC PROGRAMMING, VII
[9]  
Bouvrie J, 2012, ANN ALLERTON CONF, P474, DOI 10.1109/Allerton.2012.6483256
[10]  
Carpin S, 2014, IEEE INT C INT ROBOT, P1147, DOI 10.1109/IROS.2014.6942702