Solving Ergodic Markov Decision Processes and Perfect Information Zero-sum Stochastic Games by Variance Reduced Deflated Value Iteration

被引:0
|
作者
Akian, Marianne [1 ,2 ]
Gaubert, Stephane [1 ,2 ]
Qu, Zheng [3 ]
Saadi, Omar [1 ,2 ]
机构
[1] Ecole Polytech, INRIA, F-91128 Palaiseau, France
[2] Ecole Polytech, CMAP, Route Saclay, F-91128 Palaiseau, France
[3] Univ Hong Kong, Dept Math, Room 419,Run Run Shaw Bldg,Pokfulam Rd, Hong Kong, Peoples R China
来源
2019 IEEE 58TH CONFERENCE ON DECISION AND CONTROL (CDC) | 2019年
关键词
D O I
暂无
中图分类号
TP [自动化技术、计算机技术];
学科分类号
0812 ;
摘要
Recently, Sidford, Wang, Wu and Ye (2018) developed an algorithm combining variance reduction techniques with value iteration to solve discounted Markov decision processes. This algorithm has a sublinear complexity when the discount factor is fixed. Here, we extend this approach to mean-payoff problems, including both Markov decision processes and perfect information zero-sum stochastic games. We obtain sublinear complexity bounds, assuming there is a distinguished state which is accessible from all initial states and for all policies. Our method is based on a reduction from the mean payoff problem to the discounted problem by a Doob h-transform, combined with a deflation technique. The complexity analysis of this algorithm uses at the same time the techniques developed by Sidford et al. in the discounted case and non-linear spectral theory techniques (Collatz-Wielandt characterization of the eigenvalue).
引用
收藏
页码:5963 / 5970
页数:8
相关论文
共 50 条
  • [41] Zero-Sum Ergodic Semi-Markov Games with Weakly Continuous Transition Probabilities
    Jaskiewicz, A.
    JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS, 2009, 141 (02) : 321 - 347
  • [43] Uniform continuity of the value of zero-sum games with differential information
    Einy, Ezra
    Haimanko, Ori
    Moreno, Diego
    Shitovitz, Benyamin
    MATHEMATICS OF OPERATIONS RESEARCH, 2008, 33 (03) : 552 - 560
  • [44] Algorithms for uniform optimal strategies in two-player zero-sum stochastic games with perfect information
    Avrachenkov, Konstantin
    Cottatellucci, Laura
    Maggi, Lorenzo
    OPERATIONS RESEARCH LETTERS, 2012, 40 (01) : 56 - 60
  • [45] Value functions for depth-limited solving in zero-sum imperfect-information games
    Kovarik, Vojtech
    Seitz, Dominik
    Lisy, Viliam
    Rudolf, Jan
    Sun, Shuo
    Ha, Karel
    ARTIFICIAL INTELLIGENCE, 2023, 314
  • [46] CONTINUITY PROPERTIES OF VALUE FUNCTIONS IN INFORMATION STRUCTURES FOR ZERO-SUM AND GENERAL GAMES AND STOCHASTIC TEAMS*
    Hogeboom-Burr, Ian
    Yuksel, Serdar
    SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2023, 61 (02) : 398 - 414
  • [47] Zero-sum stochastic games with the average-value-at-risk criterion
    Liu, Qiuli
    Ching, Wai-Ki
    Guo, Xianping
    TOP, 2023, 31 (03) : 618 - 647
  • [48] Zero-sum stochastic games with the average-value-at-risk criterion
    Qiuli Liu
    Wai-Ki Ching
    Xianping Guo
    TOP, 2023, 31 : 618 - 647
  • [49] Efficient Computation of Discounted Asymmetric Information Zero-Sum Stochastic Games
    Li, Lichun
    Shamma, Jeff S.
    2015 54TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2015, : 4531 - 4536
  • [50] Solving zero-sum one-sided partially observable stochastic games
    Horak, Karel
    Bosansky, Branislav
    Kovarik, Vojtech
    Kiekintveld, Christopher
    ARTIFICIAL INTELLIGENCE, 2023, 316