Policy iteration for robust nonstationary Markov decision processes

Cited by: 0

Authors:

Saumya Sinha

Archis Ghate

Affiliation:

[1] University of Washington

Source:

Optimization Letters | 2016 / Vol. 10

Keywords:

Bellman’s equations; Infinite-dimensional optimization

DOI:

Not available

Abstract:

Policy iteration is a well-studied algorithm for solving stationary Markov decision processes (MDPs). It has also been extended to robust stationary MDPs. For robust nonstationary MDPs, however, an “as is” execution of this algorithm is not possible because it would call for an infinite amount of computation in each iteration. We therefore present a policy iteration algorithm for robust nonstationary MDPs, which performs finitely implementable approximate variants of policy evaluation and policy improvement in each iteration. We prove that the sequence of cost-to-go functions produced by this algorithm monotonically converges pointwise to the optimal cost-to-go function; the policies generated converge subsequentially to an optimal policy.
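For orientation, the evaluate-then-improve loop that the abstract refers to can be sketched for the simpler robust stationary case. This is a minimal illustration, not the algorithm from the paper: it assumes a finite state and action space, a discount factor `gamma`, and a finite set of candidate transition distributions for each state-action pair (all names and the toy data are hypothetical). Policy evaluation is done approximately by fixed-point iteration up to a tolerance.

```python
# Illustrative sketch only: robust policy iteration for a small STATIONARY MDP.
# All function names and the toy data below are hypothetical, not from the paper.

def robust_q(costs, kernels, gamma, v, s, a):
    """Worst-case cost of taking action a in state s, then incurring cost-to-go v."""
    worst_next = max(
        sum(p * v[t] for t, p in enumerate(dist)) for dist in kernels[s][a]
    )
    return costs[s][a] + gamma * worst_next


def evaluate(policy, costs, kernels, gamma, tol=1e-10):
    """Approximate robust policy evaluation by fixed-point iteration."""
    v = [0.0] * len(costs)
    while True:
        new_v = [robust_q(costs, kernels, gamma, v, s, policy[s])
                 for s in range(len(costs))]
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v


def policy_iteration(costs, kernels, gamma):
    """Alternate evaluation and greedy improvement until the policy is stable."""
    policy = [0] * len(costs)
    while True:
        v = evaluate(policy, costs, kernels, gamma)
        changed = False
        for s in range(len(costs)):
            q = [robust_q(costs, kernels, gamma, v, s, a)
                 for a in range(len(costs[s]))]
            best = min(range(len(q)), key=q.__getitem__)
            # Switch only on a strict improvement, to avoid cycling on ties.
            if q[best] < q[policy[s]] - 1e-8:
                policy[s] = best
                changed = True
        if not changed:
            return policy, v


# Toy instance: costs[s][a], and kernels[s][a] is a list of candidate
# next-state distributions from which an adversary picks the worst.
costs = [[1.0, 2.0], [0.0, 5.0]]
kernels = [
    [[[1.0, 0.0]], [[0.0, 1.0], [0.5, 0.5]]],
    [[[0.0, 1.0]], [[1.0, 0.0]]],
]
policy, v = policy_iteration(costs, kernels, gamma=0.9)
```

On this toy instance the loop stops after two evaluations with `policy == [1, 0]`. In the nonstationary setting of the abstract, the state space is effectively indexed by time as well, so neither step can be carried out exactly in finite time, which is why the paper works with finitely implementable approximations of both.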

Pages: 1613-1628

Page count: 15


