Contraction conditions for average and alpha-discount optimality in countable state Markov games with unbounded rewards

Cited by: 41
Authors
Altman, E [1]
Hordijk, A [1]
Spieksma, FM [1]
Affiliations
[1] LEIDEN UNIV,DEPT MATH & COMP SCI,NL-2300 RA LEIDEN,NETHERLANDS
Keywords
noncooperative Markov games; mu-geometric recurrence; equilibrium policies; value iteration; birth-death control;
DOI
10.1287/moor.22.3.588
Chinese Library Classification
C93 [Management science]; O22 [Operations research];
Subject classification codes
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
The goal of this paper is to provide a theory of N-person Markov games with unbounded cost, for a countable state space and compact action spaces. We investigate both the finite and infinite horizon problems. For the latter, we consider the discounted cost as well as the expected average cost. We present conditions for the infinite horizon problems under which equilibrium policies exist for all players within the stationary policies, and show that the costs in equilibrium satisfy the optimality equations. Similar results are obtained for the finite horizon costs, for which equilibrium policies are shown to exist for all players within the Markov policies. As a special case of N-person games, we investigate the two-player zero-sum game, for which we establish the convergence of the value iteration algorithm. We conclude by studying an application of a zero-sum Markov game in a queueing model.
Pages: 588-618
Number of pages: 31