On structural properties of optimal average cost functions in Markov decision processes with Borel spaces and universally measurable policies

Cited by: 1

Authors:

Yu, Huizhen [1]

Affiliations:

[1] Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada

Source:

JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS | 2022 / Vol. 509 / No. 01

Keywords:

Markov decision processes; Universally measurable policies; Average cost; Submartingales; Reachability; Recurrent Markov chains; MINIMUM PAIR; EQUATION; STATE; CONVERGENCE; EXISTENCE; CHAINS;

DOI:

10.1016/j.jmaa.2021.125954

Chinese Library Classification (CLC) number:

O29 [Applied Mathematics];

Discipline classification code:

070104;

Abstract:

We consider Markov decision processes (MDPs) with Borel state and action spaces and universally measurable policies. For several long-run average cost criteria and two classes of MDPs, we prove sufficient conditions for the optimal average cost functions to be constant almost everywhere with respect to certain sigma-finite measures. Besides suitable boundedness conditions on the positive parts of the one-stage costs, the key condition here is that each subset of states with positive measure be reachable with probability one under some policy. Our proofs exploit an inequality for the optimal average cost functions and its connection with submartingales, and, in a special case that involves stationary policies, also use the theory of recurrent Markov chains. (c) 2021 Elsevier Inc. All rights reserved.

Citations

Pages: 23

49 items in total

[1] AVERAGE COST OPTIMALITY INEQUALITY FOR MARKOV DECISION PROCESSES WITH BOREL SPACES AND UNIVERSALLY MEASURABLE POLICIES
Yu, Huizhen
SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2020, 58 (04) : 2469 - 2502
[2] Optimal policies for constrained average-cost Markov decision processes
Gonzalez-Hernandez, Juan
Villarreal, Cesar E.
TOP, 2011, 19 (01) : 107 - 120
[3] Optimal policies for constrained average-cost Markov decision processes
Juan González-Hernández
César E. Villarreal
TOP, 2011, 19 : 107 - 120
[4] Constrained average cost Markov control processes in Borel spaces
Hernández-Lerma, O
González-Hernández, J
López-Martínez, RR
SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2003, 42 (02) : 442 - 468
[5] Constrained Markov decision processes in Borel spaces: from discounted to average optimality
Mendoza-Perez, Armando F.
Jasso-Fuentes, Hector
De-la-Cruz Courtois, Omar A.
MATHEMATICAL METHODS OF OPERATIONS RESEARCH, 2016, 84 (03) : 489 - 525
[6] A PERTURBATION APPROACH TO APPROXIMATE VALUE ITERATION FOR AVERAGE COST MARKOV DECISION PROCESSES WITH BOREL SPACES AND BOUNDED COSTS
Vega-Amaya, Oscar
Lopez-Borbon, Joaquin
KYBERNETIKA, 2019, 55 (01) : 81 - 113
[7] Value iteration in average cost Markov control processes on borel spaces
MontesdeOca, R
HernandezLerma, O
ACTA APPLICANDAE MATHEMATICAE, 1996, 42 (02) : 203 - 222
[8] Policy iteration for average cost Markov control processes on Borel spaces
HernandezLerma, O
Lasserre, JB
ACTA APPLICANDAE MATHEMATICAE, 1997, 47 (02) : 125 - 154
[9] Policy Iteration for Average Cost Markov Control Processes on Borel Spaces
Onésimo Hernández-Lerma
Jean B. Lasserre
Acta Applicandae Mathematica, 1997, 47 : 125 - 154
[10] AVERAGE COST OPTIMAL POLICIES FOR MARKOV CONTROL PROCESSES WITH BOREL STATE-SPACE AND UNBOUNDED COSTS
HERNANDEZLERMA, O
LASSERRE, JB
SYSTEMS & CONTROL LETTERS, 1990, 15 (04) : 349 - 356
