Nonapproximability results for partially observable Markov decision processes

Cited by: 48
Authors
Lusena, C [1 ]
Goldsmith, J
Mundhenk, M
Affiliations
[1] Univ Kentucky, Dept Comp Sci, Lexington, KY 40506 USA
[2] Univ Trier, FB Informat 4, D-54286 Trier, Germany
Funding
National Science Foundation (USA);
Keywords
DOI
10.1613/jair.714
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
We show that for several variations of partially observable Markov decision processes, polynomial-time algorithms for finding control policies either are unlikely to have, or provably do not have, guarantees of finding policies within a constant factor or a constant summand of optimal. Here "unlikely" means "unless some complexity classes collapse," where the collapses considered are P = NP, P = PSPACE, or P = EXP. Until or unless these collapses are shown to hold, any control-policy designer must choose between such performance guarantees and efficient computation.
Pages: 83-113
Page count: 31