On Canonical Forms for Zero-Sum Stochastic Mean Payoff Games

被引:7
作者
Boros, Endre [1 ]
Elbassioni, Khaled [2 ]
Gurvich, Vladimir [1 ]
Makino, Kazuhisa [3 ]
机构
[1] Rutgers State Univ, RUTCOR, Piscataway, NJ 08854 USA
[2] Masdar Inst Sci & Technol, Abu Dhabi, U Arab Emirates
[3] Univ Tokyo, Grad Sch Informat Sci & Technol, Tokyo 1138656, Japan
关键词
Stochastic games; Zero-sum; Saddle point; Equilibrium; Mean-payoff games; PERFECT INFORMATION; PARITY GAMES; GRAPHS; COMPLEXITY; TRANSITIONS; ALGORITHM; PROPERTY; PLAYER;
D O I
10.1007/s13235-013-0075-x
中图分类号
O1 [数学];
学科分类号
0701 ; 070101 ;
摘要
We consider two-person zero-sum mean payoff undiscounted stochastic games and obtain sufficient conditions for the existence of a saddle point in uniformly optimal stationary strategies. Namely, these conditions enable us to bring the game, by applying potential transformations, to a canonical form in which locally optimal strategies are globally optimal, and hence the value for every initial position and the optimal strategies of both players can be obtained by playing the local game at each state. We show that these conditions hold for the class of additive transition (AT) games, that is, the special case when the transitions at each state can be decomposed into two parts, each controlled completely by one of the two players. An important special case of AT-games form the so-called BWR-games which are played by two players on a directed graph with positions of three types: Black, White and Random. We give an independent proof for the existence of a canonical form in such games, and use this result to derive the existence of a canonical form (and hence, of a saddle point in uniformly optimal stationary strategies) in a wide class of games, which includes stochastic games with perfect information (PI), switching controller (SC) games and additive rewards, additive transition (ARAT) games. Unlike the proof for AT-games, our proof for the BWR-case does not rely on the existence of a saddle point in stationary strategies. We also derive some algorithmic consequences from these our reductions to BWR-games, in terms of solving PI-, and ARAT-games in sub-exponential time.
引用
收藏
页码:128 / 161
页数:34
相关论文
共 47 条
[1]  
Andersson D, 2009, LECT NOTES COMPUT SC, V5878, P112, DOI 10.1007/978-3-642-10631-6_13
[2]  
[Anonymous], 2004, P 15 ANN ACM SIAM S
[3]  
[Anonymous], 1960, DYNAMIC PROGRAMMING
[4]  
[Anonymous], J LONDON MATH SOC
[5]  
Beffara E., 2001, 2001020 UPPS U DEP I
[6]  
BEFFARA E, 2001, 2001025 UPPS U DEP I
[7]  
Bewley T., 1978, Mathematics of Operations Research, V3, P104, DOI 10.1287/moor.3.2.104
[8]   DISCRETE DYNAMIC-PROGRAMMING [J].
BLACKWELL, D .
ANNALS OF MATHEMATICAL STATISTICS, 1962, 33 (02) :719-&
[9]  
Boros E, 2012, RRR152012 RUTCOR RUT
[10]  
Boros E, 2012, RRR242012 RUTCOR RUT