Empirical approximation of Nash equilibria in finite Markov games with discounted payoffs

Cited by: 1
Authors
Robles-Aguilar, Alan D. [1]
Gonzalez-Sanchez, David [2]
Minjarez-Sosa, J. Adolfo [3]
Affiliations
[1] Inst Tecnol Sonora, Dept Matemat, Obregon, Mexico
[2] CONACYT Univ Sonora, Catedras CONACYT, Hermosillo, Sonora, Mexico
[3] Univ Sonora, Dept Matemat, Hermosillo, Sonora, Mexico
Keywords
discounted criterion; empirical estimation; Markov games; Nash equilibrium; efficient adaptive strategies; stochastic games
DOI
10.1002/asjc.2932
Chinese Library Classification
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
This paper deals with finite nonzero-sum Markov games under a discounted optimality criterion over an infinite horizon. The state process evolves according to a stochastic difference equation and depends on the players' actions as well as on a random disturbance whose distribution is unknown to the players. The players observe the actions, the states, and the values of the disturbance; they then use the empirical distribution of the disturbances to estimate the true distribution and make choices based on the available information. In this context, we propose a procedure that converges almost surely (possibly after passing to a subsequence) to approximate Nash equilibria of the Markov game with the true distribution of the random disturbance.
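The estimation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: it assumes the disturbance takes finitely many values, and the dynamics function `F`, the sample list `observed`, and both helper names are hypothetical choices made only for illustration. The sketch computes the empirical distribution of the observed disturbances and the next-state law it induces through a stochastic difference equation of the form x' = F(x, actions, s); computing equilibria of the estimated game is not shown.

```python
from collections import Counter


def empirical_distribution(samples):
    """Relative frequency of each observed disturbance value."""
    n = len(samples)
    counts = Counter(samples)
    return {s: c / n for s, c in counts.items()}


def estimated_transition(F, x, actions, rho):
    """Estimated next-state law: q(y) = sum over s of rho(s) * 1{F(x, actions, s) == y}."""
    q = {}
    for s, p in rho.items():
        y = F(x, actions, s)
        q[y] = q.get(y, 0.0) + p
    return q


# Hypothetical toy dynamics on the state space {0, 1, 2} (an assumption,
# not taken from the paper): two players' actions plus the disturbance.
def F(x, actions, s):
    a, b = actions
    return (x + a + b + s) % 3


# Hypothetical observed disturbance values.
observed = [0, 1, 1, 0, 1, 1, 1, 0]

rho = empirical_distribution(observed)
q = estimated_transition(F, 0, (1, 0), rho)
print(rho)
print(q)
```

As more disturbance values are observed, `rho` is recomputed and the estimated transition law changes with it; the players' choices at each stage would be based on this current estimate.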
Pages: 722-734
Number of pages: 13