Policy iteration for robust nonstationary Markov decision processes

Authors
Saumya Sinha
Archis Ghate
Affiliation
[1] University of Washington,
Source
Optimization Letters | 2016 / Vol. 10
Keywords
Bellman’s equations; Infinite-dimensional optimization;
DOI
Not available
Abstract
Policy iteration is a well-studied algorithm for solving stationary Markov decision processes (MDPs). It has also been extended to robust stationary MDPs. For robust nonstationary MDPs, however, an “as is” execution of this algorithm is not possible because it would call for an infinite amount of computation in each iteration. We therefore present a policy iteration algorithm for robust nonstationary MDPs, which performs finitely implementable approximate variants of policy evaluation and policy improvement in each iteration. We prove that the sequence of cost-to-go functions produced by this algorithm monotonically converges pointwise to the optimal cost-to-go function; the policies generated converge subsequentially to an optimal policy.
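For orientation, the classical algorithm that the abstract starts from can be sketched as follows. This is only the textbook policy iteration for a finite, stationary, discounted-cost MDP, alternating exact policy evaluation with greedy policy improvement; it is not the authors' algorithm for robust nonstationary MDPs, which replaces both steps with finitely implementable approximations. The function name `policy_iteration`, the array layouts of `P` and `c`, the discount factor `gamma`, and the two-state example are all illustrative assumptions, not taken from the paper.

```python
import numpy as np

def policy_iteration(P, c, gamma=0.9, max_iter=1000, tol=1e-12):
    """Textbook policy iteration for a finite stationary discounted-cost MDP.

    P: array of shape (A, S, S); P[a, s, t] is the probability of s -> t under a.
    c: array of shape (S, A); c[s, a] is the immediate cost of action a in state s.
    Returns (policy, cost_to_go). Layouts are assumptions for this sketch.
    """
    n_states = P.shape[1]
    idx = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)
    v = np.zeros(n_states)
    for _ in range(max_iter):
        # Policy evaluation: solve (I - gamma * P_pi) v = c_pi exactly.
        P_pi = P[policy, idx, :]
        c_pi = c[idx, policy]
        v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, c_pi)
        # Policy improvement: one-step lookahead costs for every (state, action).
        q = c + gamma * np.einsum("ast,t->sa", P, v)
        # Switch actions only on strict improvement, so ties cannot cause cycling.
        improved = q[idx, policy] - q.min(axis=1) > tol
        if not improved.any():
            break
        policy = np.where(improved, np.argmin(q, axis=1), policy)
    return policy, v

# Hypothetical two-state example: in state 0, action 0 stays (cost 1) and
# action 1 moves to state 1 (cost 2); state 1 is absorbing with zero cost.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [0.0, 1.0]]])
c = np.array([[1.0, 2.0],
              [0.0, 0.0]])
policy, v = policy_iteration(P, c, gamma=0.9)
```

In the nonstationary setting the abstract describes, the exact linear solve and the full improvement sweep above are not available, since each would require an infinite amount of computation per iteration.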
Pages: 1613-1628
Number of pages: 15
Related papers
50 in total
  • [1] Policy iteration for robust nonstationary Markov decision processes
    Sinha, Saumya
    Ghate, Archis
    OPTIMIZATION LETTERS, 2016, 10 (08) : 1613 - 1628
  • [2] Partial policy iteration for L1-Robust Markov decision processes
    Ho, Chin Pang
    Petrik, Marek
    Wiesemann, Wolfram
    Journal of Machine Learning Research, 2021, 22
  • [3] Geometric Policy Iteration for Markov Decision Processes
    Wu, Yue
    De Loera, Jesus A.
    PROCEEDINGS OF THE 28TH ACM SIGKDD CONFERENCE ON KNOWLEDGE DISCOVERY AND DATA MINING, KDD 2022, 2022 : 2070 - 2078
  • [4] Robust topological policy iteration for infinite horizon bounded Markov Decision Processes
    Silva Reis, Willy Arthur
    de Barros, Leliane Nunes
    Delgado, Karina Valdivia
    INTERNATIONAL JOURNAL OF APPROXIMATE REASONING, 2019, 105 : 287 - 304
  • [5] Policy set iteration for Markov decision processes
    Chang, Hyeong Soo
    AUTOMATICA, 2013, 49 (12) : 3687 - 3689
  • [6] Efficient Policy Iteration for Periodic Markov Decision Processes
    Osogami, Takayuki
    Raymond, Rudy
    21ST EUROPEAN CONFERENCE ON ARTIFICIAL INTELLIGENCE (ECAI 2014), 2014, 263 : 1167 - 1172
  • [7] Evolutionary policy iteration for solving Markov decision processes
    Chang, HS
    Lee, HG
    Fu, MC
    Marcus, SI
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2005, 50 (11) : 1804 - 1808
  • [8] The Smoothed Complexity of Policy Iteration for Markov Decision Processes
    Christ, Miranda
    Yannakakis, Mihalis
    PROCEEDINGS OF THE 55TH ANNUAL ACM SYMPOSIUM ON THEORY OF COMPUTING, STOC 2023, 2023 : 1890 - 1903
  • [9] Policy Iteration for Decentralized Control of Markov Decision Processes
    Bernstein, Daniel S.
    Amato, Christopher
    Hansen, Eric A.
    Zilberstein, Shlomo
    JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH, 2009, 34 : 89 - 132
  • [10] A study of value iteration and policy iteration for Markov decision processes in Deterministic systems
    Zheng, Haifeng
    Wang, Dan
    AIMS MATHEMATICS, 2024, 9 (12) : 33818 - 33842