Nonapproximability results for partially observable Markov decision processes

Cited by: 48
Authors
Lusena, C [1 ]
Goldsmith, J
Mundhenk, M
Affiliations
[1] Univ Kentucky, Dept Comp Sci, Lexington, KY 40506 USA
[2] Univ Trier, FB Informat 4, D-54286 Trier, Germany
Funding
National Science Foundation (USA);
Keywords
DOI
10.1613/jair.714
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We show that for several variations of partially observable Markov decision processes, polynomial-time algorithms for finding control policies either are unlikely to have, or provably do not have, guarantees of finding policies within a constant factor or a constant summand of optimal. Here "unlikely" means "unless some complexity classes collapse," where the collapses considered are P = NP, P = PSPACE, or P = EXP. Until or unless these collapses are shown to hold, any control-policy designer must choose between such performance guarantees and efficient computation.
Pages: 83-113
Page count: 31