The complexity of decentralized control of Markov decision processes

被引:551
作者
Bernstein, DS [1 ]
Givan, R
Immerman, N
Zilberstein, S
机构
[1] Univ Massachusetts, Dept Comp Sci, Amherst, MA 01003 USA
[2] Purdue Univ, Sch Elect & Comp Engn, W Lafayette, IN 47907 USA
关键词
computational complexity; Markov decision process; decentralized control;
D O I
10.1287/moor.27.4.819.297
中图分类号
C93 [管理学]; O22 [运筹学];
学科分类号
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
摘要
We consider decentralized control of Markov decision processes and give complexity bounds on the worst-case running time for algorithms that find optimal solutions. Generalizations of both the fully observable case and the partially observable case that allow for decentralized control are described. For even two agents, the finite-horizon problems corresponding to both of these models are hard for nondeterministic exponential time. These complexity results illustrate a fundamental difference between centralized and decentralized control of Markov decision processes. In contrast to the problems involving centralized control, the problems we consider provably do not admit polynomial-time algorithms. Furthermore, assuming EXP not equal NEXP, the problems require superexponential time to solve in the worst case.
引用
收藏
页码:819 / 840
页数:22
相关论文
共 50 条
  • [41] Markov decision processes with quasi-hyperbolic discounting
    Anna Jaśkiewicz
    Andrzej S. Nowak
    Finance and Stochastics, 2021, 25 : 189 - 229
  • [42] Transfer-Entropy-Regularized Markov Decision Processes
    Tanaka, Takashi
    Sandberg, Henrik
    Skoglund, Mikael
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2022, 67 (04) : 1944 - 1951
  • [43] Lexicographic refinements in stationary possibilistic Markov Decision Processes
    Ben Amor, Nahla
    El Khalfi, Zeineb
    Fargier, Helene
    Sabbadin, Regis
    INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2018, 103 : 343 - 363
  • [44] Modelling the profitability of credit cards by Markov decision processes
    So, Meko M. C.
    Thomas, Lyn C.
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2011, 212 (01) : 123 - 130
  • [45] Evolutionary policy iteration for solving Markov decision processes
    Chang, HS
    Lee, HG
    Fu, MC
    Marcus, SI
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2005, 50 (11) : 1804 - 1808
  • [46] Markov decision processes with iterated coherent risk measures
    Chu, Shanyun
    Zhang, Yi
    INTERNATIONAL JOURNAL OF CONTROL, 2014, 87 (11) : 2286 - 2293
  • [47] Markov decision processes with quasi-hyperbolic discounting
    Jaskiewicz, Anna
    Nowak, Andrzej S.
    FINANCE AND STOCHASTICS, 2021, 25 (02) : 189 - 229
  • [48] Experimental Design for Partially Observed Markov Decision Processes
    Thorbergsson, Leifur
    Hooker, Giles
    SIAM-ASA JOURNAL ON UNCERTAINTY QUANTIFICATION, 2018, 6 (02): : 549 - 567
  • [49] Optimization of Markov decision processes under the variance criterion
    Xia, Li
    AUTOMATICA, 2016, 73 : 269 - 278
  • [50] Efficient Policies for Stationary Possibilistic Markov Decision Processes
    Ben Amor, Nahla
    El Khalfi, Zeineb
    Fargier, Helene
    Sabaddin, Regis
    SYMBOLIC AND QUANTITATIVE APPROACHES TO REASONING WITH UNCERTAINTY, ECSQARU 2017, 2017, 10369 : 306 - 317