Solving Ergodic Markov Decision Processes and Perfect Information Zero-sum Stochastic Games by Variance Reduced Deflated Value Iteration

被引：0

作者：

Akian, Marianne ^{[1
,2
]}

Gaubert, Stephane ^{[1
,2
]}

Qu, Zheng ^{[3
]}

Saadi, Omar ^{[1
,2
]}

机构：

[1] Ecole Polytech, INRIA, F-91128 Palaiseau, France

[2] Ecole Polytech, CMAP, Route Saclay, F-91128 Palaiseau, France

[3] Univ Hong Kong, Dept Math, Room 419,Run Run Shaw Bldg,Pokfulam Rd, Hong Kong, Peoples R China

来源：

2019 IEEE 58TH CONFERENCE ON DECISION AND CONTROL (CDC) | 2019年

关键词：

D O I：

暂无

中图分类号：

TP [自动化技术、计算机技术];

学科分类号：

0812 ;

摘要：

Recently, Sidford, Wang, Wu and Ye (2018) developed an algorithm combining variance reduction techniques with value iteration to solve discounted Markov decision processes. This algorithm has a sublinear complexity when the discount factor is fixed. Here, we extend this approach to mean-payoff problems, including both Markov decision processes and perfect information zero-sum stochastic games. We obtain sublinear complexity bounds, assuming there is a distinguished state which is accessible from all initial states and for all policies. Our method is based on a reduction from the mean payoff problem to the discounted problem by a Doob h-transform, combined with a deflation technique. The complexity analysis of this algorithm uses at the same time the techniques developed by Sidford et al. in the discounted case and non-linear spectral theory techniques (Collatz-Wielandt characterization of the eigenvalue).

引用

页码：5963 / 5970

页数：8

共 50 条

[41] Zero-Sum Ergodic Semi-Markov Games with Weakly Continuous Transition Probabilities
Jaskiewicz, A.
JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 2009, 141 (02) : 321 - 347
[42] NON ZERO-SUM STOPPING GAMES OF SYMMETRICAL MARKOV-PROCESSES
NAGAI, H
PROBABILITY THEORY AND RELATED FIELDS, 1987, 75 (04) : 487 - 497
[43] Uniform continuity of the value of zero-sum games with differential information
Einy, Ezra
Haimanko, Ori
Moreno, Diego
Shitovitz, Benyamin
MATHEMATICS OF OPERATIONS RESEARCH, 2008, 33 (03) : 552 - 560
[44] Algorithms for uniform optimal strategies in two-player zero-sum stochastic games with perfect information
Avrachenkov, Konstantin
Cottatellucci, Laura
Maggi, Lorenzo
OPERATIONS RESEARCH LETTERS, 2012, 40 (01) : 56 - 60
[45] Value functions for depth-limited solving in zero-sum imperfect-information games
Kovarik, Vojtech
Seitz, Dominik
Lisy, Viliam
Rudolf, Jan
Sun, Shuo
Ha, Karel
ARTIFICIAL INTELLIGENCE, 2023, 314
[46] CONTINUITY PROPERTIES OF VALUE FUNCTIONS IN INFORMATION STRUCTURES FOR ZERO-SUM AND GENERAL GAMES AND STOCHASTIC TEAMS*
Hogeboom-Burr, Ian
Yuksel, Serdar
SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2023, 61 (02) : 398 - 414
[47] Zero-sum stochastic games with the average-value-at-risk criterion
Liu, Qiuli
Ching, Wai-Ki
Guo, Xianping
TOP, 2023, 31 (03) : 618 - 647
[48] Zero-sum stochastic games with the average-value-at-risk criterion
Qiuli Liu
Wai-Ki Ching
Xianping Guo
TOP, 2023, 31 : 618 - 647
[49] Efficient Computation of Discounted Asymmetric Information Zero-Sum Stochastic Games
Li, Lichun
Shamma, Jeff S.
2015 54TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2015, : 4531 - 4536
[50] Solving zero-sum one-sided partially observable stochastic games
Horak, Karel
Bosansky, Branislav
Kovarik, Vojtech
Kiekintveld, Christopher
ARTIFICIAL INTELLIGENCE, 2023, 316

← 1 2 3 4 5 →