Comparative criteria for partially observable contingent planning

Cited by: 0

Authors:

Dorin Shmaryahu

Guy Shani

Jörg Hoffmann

Affiliations:

[1] Ben Gurion University of the Negev

[2] Saarland University

Source:

Autonomous Agents and Multi-Agent Systems | 2019 / Vol. 33

Keywords:

Planning; Contingent planning; Comparative criteria; Plan tree; Partial observability

DOI:

Not available

Abstract:

In contingent planning under partial observability with sensing actions, agents actively use sensing to discover meaningful facts about the world. The solution can be represented as a plan tree or graph, branching on the possible observations. Typically in contingent planning one seeks a satisfying plan, that is, one leading to a goal state at each leaf. In many applications, however, one may prefer some satisfying plans to others, such as plans that lead to the goal with a lower average cost. Methods such as average cost, however, make an implicit assumption concerning the probabilities of outcomes, which may not apply when the stochastic dynamics of the environment are unknown. We focus on the problem of providing valid comparative criteria for contingent plan trees and graphs, allowing us to compare two plans and decide which one is preferable. We suggest a set of such comparison criteria: plan simplicity, dominance, and best and worst plan costs. We also argue that in some cases certain branches of the plan correspond to an unlikely combination of mishaps and can be ignored, and we provide methods for pruning such unlikely branches before comparing the plan graphs. We explain these criteria and discuss their validity, correlations, and application to real-world problems. We also suggest efficient algorithms for computing the comparative criteria where needed. We provide experimental results showing that existing contingent planners produce diverse plans that can be compared using these criteria.
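To make the best-cost and worst-cost criteria concrete, the following is a minimal sketch, not the authors' algorithm. It assumes a plan tree in which each node is one action with a cost, each child corresponds to one possible observation, and a node without children is a goal leaf. The names `PlanNode`, `best_cost`, `worst_cost`, and `average_leaf_cost` are hypothetical. The average treats every leaf as equally likely, which is exactly the kind of implicit probability assumption the abstract cautions against.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PlanNode:
    """One action in a contingent plan tree (hypothetical structure)."""
    action: str
    cost: float = 1.0
    # One child per possible observation; an empty list marks a goal leaf.
    children: List["PlanNode"] = field(default_factory=list)


def best_cost(node: PlanNode) -> float:
    """Cost of the cheapest root-to-leaf branch."""
    if not node.children:
        return node.cost
    return node.cost + min(best_cost(c) for c in node.children)


def worst_cost(node: PlanNode) -> float:
    """Cost of the most expensive root-to-leaf branch."""
    if not node.children:
        return node.cost
    return node.cost + max(worst_cost(c) for c in node.children)


def leaf_costs(node: PlanNode, so_far: float = 0.0) -> List[float]:
    """Accumulated cost of every root-to-leaf branch."""
    total = so_far + node.cost
    if not node.children:
        return [total]
    return [c for child in node.children for c in leaf_costs(child, total)]


def average_leaf_cost(node: PlanNode) -> float:
    """Mean branch cost, ASSUMING all leaves are equally likely."""
    costs = leaf_costs(node)
    return sum(costs) / len(costs)


# Illustrative tree: sense, then either move directly to the goal
# or sense again and finish with one of two actions.
tree = PlanNode("sense-door", 1.0, [
    PlanNode("move", 2.0),
    PlanNode("sense-lock", 1.0, [
        PlanNode("unlock-and-move", 3.0),
        PlanNode("move", 1.0),
    ]),
])

print(best_cost(tree), worst_cost(tree), average_leaf_cost(tree))
```

In this illustrative tree the three branches cost 3, 5, and 3, so the best cost is 3 and the worst cost is 5. Neither value depends on outcome probabilities, whereas the average changes if the leaves are weighted differently.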

Citation

Pages: 481-517

Page count: 36
