We consider Markov decision processes (MDPs) with Borel state and action spaces and universally measurable policies. For several long-run average cost criteria and two classes of MDPs, we prove sufficient conditions for the optimal average cost functions to be constant almost everywhere with respect to certain sigma-finite measures. Besides suitable boundedness conditions on the positive parts of the one-stage costs, the key condition here is that each subset of states with positive measure be reachable with probability one under some policy. Our proofs exploit an inequality for the optimal average cost functions and its connection with submartingales, and, in a special case that involves stationary policies, also use the theory of recurrent Markov chains. © 2021 Elsevier Inc. All rights reserved.