Approximation of average cost optimal policies for general Markov decision processes with unbounded costs

Cited by: 0

Authors:

Evgueni Gordienko

Raúl Montes-De-Oca

Adolfo Minjárez-Sosa

Affiliations:

[1] Universidad Autónoma Metropolitana-Iztapalapa, Departamento de Matemáticas

[2] Universidad de Sonora, Departamento de Matemáticas

Source:

Mathematical Methods of Operations Research | 1997 / Vol. 45

Keywords:

Markov Decision Process; Average Cost Criterion; Value Iteration; Approximation of Optimal Policy; Geometrical Convergence;

DOI:

Not available

Abstract:

The aim of the paper is to show that Lyapunov-like ergodicity conditions on Markov decision processes with Borel state space and possibly unbounded cost make it possible to approximate an average cost optimal policy by solving n-stage optimization problems (n = 1, 2, ...). The approach used ensures an exponential rate of convergence. An approximation of this type would be useful for finding adaptive control procedures and for estimating the stability of an optimal control under disturbances of the transition probability.
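
The idea in the abstract of approximating an average cost optimal policy through n-stage problems can be illustrated on a toy model. The sketch below is not the paper's construction: it assumes a finite state and action space with bounded costs (the paper treats Borel state spaces and possibly unbounded costs), and the names `n_stage_value_iteration`, `COST`, and `TRANS` as well as all numbers are hypothetical. It computes the n-stage optimal costs recursively, returns the policy that is greedy at the last stage, and reports the smallest and largest one-step increments of the n-stage cost, which bracket the optimal average cost in this ergodic finite setting.

```python
def n_stage_value_iteration(cost, trans, n_stages):
    """Solve the n-stage problems V_n = min_a [c + P V_{n-1}] with V_0 = 0.

    cost[x][a]     : one-stage cost in state x under action a (toy data)
    trans[x][a][y] : probability of moving from x to y under action a
    n_stages       : number of stages, assumed to be at least 1
    Returns the greedy policy at the last stage and the min/max of
    V_n - V_{n-1}, used here as bounds on the optimal average cost.
    """
    num_states = len(cost)
    values = [0.0] * num_states
    policy = [0] * num_states
    lower = upper = 0.0
    for _ in range(n_stages):
        new_values = []
        for x in range(num_states):
            # Cost-to-go of each action: immediate cost plus expected V_{n-1}.
            q = [
                cost[x][a] + sum(p * v for p, v in zip(trans[x][a], values))
                for a in range(len(cost[x]))
            ]
            best = min(range(len(q)), key=q.__getitem__)
            policy[x] = best
            new_values.append(q[best])
        increments = [nv - v for nv, v in zip(new_values, values)]
        lower, upper = min(increments), max(increments)
        values = new_values
    return policy, lower, upper


# Hypothetical two-state, two-action model with strictly positive
# transition probabilities, so every stationary policy is ergodic.
COST = [[1.0, 2.0], [3.0, 5.0]]
TRANS = [
    [[0.9, 0.1], [0.5, 0.5]],
    [[0.2, 0.8], [0.9, 0.1]],
]

policy, lower, upper = n_stage_value_iteration(COST, TRANS, 20)
print(policy, lower, upper)
```

For this toy model the gap `upper - lower` closes after a few stages, which is a finite-state analogue of the exponential convergence rate mentioned in the abstract; it says nothing about the unbounded-cost case the paper addresses.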

Citation

Pages: 245-263

Page count: 18

50 entries in total

[31] A Discount Vanishing Approximation for Markov Decision Processes with Risk Sensitivity
Huang, Tanhao
Lu, Xiaoyang
Chen, Jinwen
JOURNAL OF DYNAMICAL AND CONTROL SYSTEMS, 2024, 30 (02)
[32] A COUNTEREXAMPLE ON THE OPTIMALITY EQUATION IN MARKOV DECISION CHAINS WITH THE AVERAGE COST CRITERION
CAVAZOSCADENA, R
SYSTEMS & CONTROL LETTERS, 1991, 16 (05) : 387 - 392
[33] Adaptive aggregation for reinforcement learning in average reward Markov decision processes
Ronald Ortner
Annals of Operations Research, 2013, 208 : 321 - 336
[34] Average Reward Reinforcement Learning for Semi-Markov Decision Processes
Yang, Jiayuan
Li, Yanjie
Chen, Haoyao
Li, Jiangang
NEURAL INFORMATION PROCESSING, ICONIP 2017, PT I, 2017, 10634 : 768 - 777
[35] Adaptive aggregation for reinforcement learning in average reward Markov decision processes
Ortner, Ronald
ANNALS OF OPERATIONS RESEARCH, 2013, 208 (01) : 321 - 336
[36] THE AVERAGE COST OPTIMALITY EQUATION FOR MARKOV CONTROL PROCESSES ON BOREL SPACES
MONTESDEOCA, R
SYSTEMS & CONTROL LETTERS, 1994, 22 (05) : 351 - 357
[37] Optimization of Parametric Policies of Markov Decision Processes under a Variance Criterion
Xia, Li
2016 13TH INTERNATIONAL WORKSHOP ON DISCRETE EVENT SYSTEMS (WODES), 2016, : 332 - 337
[38] Game Theoretic Markov Decision Processes for Optimal Decision Making in Social Systems
Chen, Yan
Gao, Yang
Jiang, Chunxiao
Liu, K. J. Ray
2014 IEEE GLOBAL CONFERENCE ON SIGNAL AND INFORMATION PROCESSING (GLOBALSIP), 2014, : 268 - 272
[39] Perceptive evaluation for the optimal discounted reward in Markov decision processes
Kurano, M
Yasuda, M
Nakagami, J
Yoshida, Y
MODELING DECISIONS FOR ARTIFICIAL INTELLIGENCE, PROCEEDINGS, 2005, 3558 : 283 - 293
[40] Optimal policy for minimizing risk models in Markov decision processes
Ohtsubo, Y
Toyonaga, K
JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 2002, 271 (01) : 66 - 81