Solving transition independent decentralized Markov decision processes

被引:102
作者
Becker, R [1 ]
Zilberstein, S [1 ]
Lesser, V [1 ]
Goldman, CV [1 ]
机构
[1] Univ Massachusetts, Dept Comp Sci, Amherst, MA 01003 USA
关键词
D O I
10.1613/jair.1497
中图分类号
TP18 [人工智能理论];
学科分类号
081104 ; 0812 ; 0835 ; 1405 ;
摘要
Formal treatment of collaborative multi-agent systems has been lagging behind the rapid progress in sequential decision making by individual agents. Recent work in the area of decentralized Markov Decision Processes (MDPs) has contributed to closing this gap, but the computational complexity of these models remains a serious obstacle. To overcome this complexity barrier, we identify a specific class of decentralized MDPs in which the agents' transitions are independent. The class consists of independent collaborating agents that are tied together through a structured global reward function that depends on all of their histories of states and actions. We present a novel algorithm for solving this class of problems and examine its properties, both as an optimal algorithm and as an anytime algorithm. To the best of our knowledge, this is the first algorithm to optimally solve a non-trivial subclass of decentralized MDPs. It lays the foundation for further work in this area on both exact and approximate algorithms.
引用
收藏
页码:423 / 455
页数:33
相关论文
共 31 条
  • [21] INTRACTABLE PROBLEMS IN CONTROL-THEORY
    PAPADIMITRIOU, CH
    TSITSIKLIS, J
    [J]. SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 1986, 24 (04) : 639 - 654
  • [22] ON THE COMPLEXITY OF DESIGNING DISTRIBUTED PROTOCOLS
    PAPADIMITRIOU, CH
    TSITSIKLIS, J
    [J]. INFORMATION AND CONTROL, 1982, 53 (03): : 211 - 218
  • [23] Peshkin Leonid, 2000, P 16 C UNCERTAINTY A
  • [24] The communicative multiagent team decision problem: Analyzing teamwork theories and models
    Pynadath, DV
    Tambe, M
    [J]. JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2002, 16 : 389 - 423
  • [25] Shen J., 2003, P 2 INT JOINT C AUT, P678
  • [26] TUMER K, 2002, P 1 INT JOINT C AUT
  • [27] WASHINGTON R, 1999, P IEEE AER C
  • [28] Wolpert D. H., 1999, Proceedings of the Third International Conference on Autonomous Agents, P77, DOI 10.1145/301136.301167
  • [29] XUAN P, 2002, P 1 INT JOINT C AUT, P1098
  • [30] YOKOO M, 1991, CSETR10191 U MICH