Empirical approximation of Nash equilibria in finite Markov games with discounted payoffs

Cited by: 1

Authors:

Robles-Aguilar, Alan D. [1]

Gonzalez-Sanchez, David [2]

Adolfo Minjarez-Sosa, J. [3]

Affiliations:

[1] Inst Tecnol Sonora, Dept Matemat, Obregon, Mexico

[2] CONACYT Univ Sonora, Catedras CONACYT, Hermosillo, Sonora, Mexico

[3] Univ Sonora, Dept Matemat, Hermosillo, Sonora, Mexico

Source:

ASIAN JOURNAL OF CONTROL | 2023 / Vol. 25 / Issue 02

Keywords:

discounted criterion; empirical estimation; Markov games; Nash equilibrium; EFFICIENT ADAPTIVE STRATEGIES; STOCHASTIC GAMES;

DOI:

10.1002/asjc.2932

Chinese Library Classification (CLC) number:

TP [Automation Technology, Computer Technology];

Discipline classification code:

0812 ;

Abstract:

This paper deals with finite nonzero-sum Markov games under a discounted optimality criterion over an infinite horizon. The state process evolves according to a stochastic difference equation and depends on the players' actions as well as on a random disturbance whose distribution is unknown to the players. The players observe the actions, the states, and the values of the disturbance; they then use the empirical distribution of the disturbances to estimate the true distribution and make choices based on the available information. In this context, we propose a procedure, almost surely convergent (possibly after passing to a subsequence), for approximating Nash equilibria of the Markov game with the true distribution of the random disturbance.
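
The estimation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it only shows how an empirical distribution of observed disturbances induces an estimated transition law through a difference equation of the form x' = F(x, a, b, xi). The two-player setting, the function names, the toy dynamics F, and the sample data are all assumptions made for illustration.

```python
from collections import Counter
from fractions import Fraction


def empirical_distribution(samples):
    """Relative frequency of each observed disturbance value."""
    n = len(samples)
    counts = Counter(samples)
    return {xi: Fraction(c, n) for xi, c in counts.items()}


def estimated_transition(F, x, a, b, rho):
    """Law of the next state x' = F(x, a, b, xi) when xi is drawn from rho."""
    law = {}
    for xi, p in rho.items():
        nxt = F(x, a, b, xi)
        law[nxt] = law.get(nxt, 0) + p
    return law


# Hypothetical toy dynamics on the state space {0, 1, 2} (not from the paper).
def F(x, a, b, xi):
    return (x + a + b + xi) % 3


# Hypothetical disturbances observed during the first five stages.
observed = [0, 1, 1, 0, 1]
rho_t = empirical_distribution(observed)
law = estimated_transition(F, x=0, a=1, b=0, rho=rho_t)
```

In a sketch like this, players would recompute `rho_t` as new disturbances are observed and solve the game induced by the estimated transition law at each stage; how those stage-wise equilibria relate to equilibria of the true game is the subject of the paper itself.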


Pages: 722-734

Number of pages: 13

