The Design of ϵ-Optimal Strategy for Two-Person Zero-Sum Markov Games

Cited by: 0

Authors:

Xie, Kaiyun [1]

Xiong, Junlin [1]

Affiliations:

[1] Univ Sci & Technol China, Dept Automat, Hefei 230022, Peoples R China

Source:

IEEE CONTROL SYSTEMS LETTERS | 2024 / Vol. 8

Funding:

National Natural Science Foundation of China;

Keywords:

Games; Costs; Nash equilibrium; Convergence; Heuristic algorithms; Standards; Q-learning; Estimation; Approximation algorithms; Vectors; Gauss-Seidel iteration; Markov game; receding horizon; imprecision; ε-optimal strategy;

DOI:

10.1109/LCSYS.2024.3474057

Chinese Library Classification (CLC) number:

TP [Automation technology, computer technology];

Discipline classification code:

0812;

Abstract:

This letter focuses on designing approximate Nash strategies for two-person zero-sum Markov games. Using the receding horizon method, ε-optimal strategies are designed to approximate Nash strategies by executing a finite number of Gauss-Seidel iterations. The relationship between the approximation error ε and the number of iterations is also analyzed. Additionally, ε-optimal strategies are designed for two scenarios with imprecise parameters. For the scenario with imprecise values, ε is determined by the errors between the imprecise values and the iteration values. This result provides a theoretical basis for efficiently designing ε-optimal strategies using heuristic algorithms or approximate dynamic programming. For the scenario with imprecise transition probabilities, ε is determined by the errors between the estimated and the actual transition probabilities. This result enables the use of pattern recognition or other methods to estimate the actual transition probabilities when designing ε-optimal strategies.
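To make the idea of "a finite number of Gauss-Seidel iterations" concrete, the following is a minimal sketch of Gauss-Seidel-style value iteration for a zero-sum Markov game. It is not the letter's algorithm: it assumes a discounted cost criterion with discount factor `gamma`, exactly two actions per player (so each stage matrix game has a closed-form value), a row player who minimizes cost, and a plain stopping tolerance `tol` in place of the letter's ε analysis. All names (`matrix_game_value_2x2`, `q_matrix`, `gauss_seidel_values`, `COST`, `PROB`) and the toy game data are hypothetical.

```python
def matrix_game_value_2x2(m):
    """Value of a 2x2 zero-sum matrix game (rows minimize, columns maximize)."""
    (a, b), (c, d) = m
    upper = min(max(a, b), max(c, d))  # best pure guarantee for the minimizer
    lower = max(min(a, c), min(b, d))  # best pure guarantee for the maximizer
    if upper == lower:                 # a pure saddle point exists
        return upper
    # No saddle point: both players mix, and the value has a closed form.
    return (a * d - b * c) / (a + d - b - c)


def q_matrix(s, v, cost, prob, gamma):
    """Stage game at state s: immediate cost plus discounted expected value."""
    return [
        [
            cost[s][i][j]
            + gamma * sum(p * v[t] for t, p in enumerate(prob[s][i][j]))
            for j in range(2)
        ]
        for i in range(2)
    ]


def gauss_seidel_values(cost, prob, gamma, tol=1e-10, max_sweeps=10_000):
    """Sweep over states, overwriting v[s] in place (Gauss-Seidel style)."""
    v = [0.0] * len(cost)
    for sweep in range(1, max_sweeps + 1):
        change = 0.0
        for s in range(len(v)):
            new = matrix_game_value_2x2(q_matrix(s, v, cost, prob, gamma))
            change = max(change, abs(new - v[s]))
            v[s] = new  # later states in this sweep already see the update
        if change <= tol:
            return v, sweep
    return v, max_sweeps


# Hypothetical two-state game: COST[s][i][j] is the stage cost, and
# PROB[s][i][j] is the next-state distribution for actions (i, j).
COST = [[[1.0, 0.0], [0.0, 1.0]], [[2.0, 3.0], [4.0, 1.0]]]
PROB = [
    [[[0.5, 0.5], [1.0, 0.0]], [[0.0, 1.0], [0.5, 0.5]]],
    [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [0.5, 0.5]]],
]

values, sweeps = gauss_seidel_values(COST, PROB, gamma=0.9)
print(values, sweeps)
```

Updating `v[s]` in place is what distinguishes the Gauss-Seidel sweep from a standard (Jacobi) sweep, which would compute every new value from the previous sweep's vector. Stopping after a fixed number of sweeps instead of at `tol` gives an approximate value vector of the kind from which an ε-optimal strategy could be derived.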


Pages: 2349 - 2354

Number of pages: 6

50 items in total

[1] Two-person zero-sum Markov games: Receding horizon approach
Chang, HS
Marcus, SI
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2003, 48 (11) : 1951 - 1961
[2] Value set iteration for two-person zero-sum Markov games
Chang, Hyeong Soo
AUTOMATICA, 2017, 76 : 61 - 64
[3] TWO-PERSON ZERO-SUM STOCHASTIC GAMES
Baykal-Guersoy, Melike
ANNALS OF OPERATIONS RESEARCH, 1991, 28 (01) : 135 - 152
[4] A perturbation on two-person zero-sum games
Kimura, Y
Sawasaki, Y
Tanaka, K
ADVANCES IN DYNAMIC GAMES AND APPLICATIONS, 2000, 5 : 279 - 288
[5] On the solution of two-person zero-sum matrix games
Stefanov, Stefan M.
JOURNAL OF INFORMATION & OPTIMIZATION SCIENCES, 2024, 45 (03) : 649 - 657
[6] 'TWO-PERSON/ZERO-SUM'
NEMEROV, H
KENYON REVIEW, 1986, 8 (03) : 74 - 74
[7] On characterization of equilibrium strategy of two-person zero-sum games with fuzzy payoffs
Maeda, T
FUZZY SETS AND SYSTEMS, 2003, 139 (02) : 283 - 296
[8] ON ZERO-SUM TWO-PERSON UNDISCOUNTED SEMI-MARKOV GAMES WITH A MULTICHAIN STRUCTURE
Mondal, Prasenjit
ADVANCES IN APPLIED PROBABILITY, 2017, 49 (03) : 826 - 849
[9] Linear Programming and Zero-Sum Two-Person Undiscounted Semi-Markov Games
Mondal, Prasenjit
ASIA-PACIFIC JOURNAL OF OPERATIONAL RESEARCH, 2015, 32 (06)
[10] Perfect information two-person zero-sum Markov games with imprecise transition probabilities
Chang, Hyeong Soo
MATHEMATICAL METHODS OF OPERATIONS RESEARCH, 2006, 64 (02) : 335 - 351
