On structural properties of optimal average cost functions in Markov decision processes with Borel spaces and universally measurable policies

Cited by: 1
Authors
Yu, Huizhen [1]
Affiliation
[1] Univ Alberta, Dept Comp Sci, Edmonton, AB, Canada
Keywords
Markov decision processes; Universally measurable policies; Average cost; Submartingales; Reachability; Recurrent Markov chains; MINIMUM PAIR; EQUATION; STATE; CONVERGENCE; EXISTENCE; CHAINS;
DOI
10.1016/j.jmaa.2021.125954
Chinese Library Classification number
O29 [Applied mathematics]
Discipline classification code
070104
Abstract
We consider Markov decision processes (MDPs) with Borel state and action spaces and universally measurable policies. For several long-run average cost criteria and two classes of MDPs, we prove sufficient conditions for the optimal average cost functions to be constant almost everywhere with respect to certain sigma-finite measures. Besides suitable boundedness conditions on the positive parts of the one-stage costs, the key condition here is that each subset of states with positive measure be reachable with probability one under some policy. Our proofs exploit an inequality for the optimal average cost functions and its connection with submartingales, and, in a special case that involves stationary policies, also use the theory of recurrent Markov chains. (c) 2021 Elsevier Inc. All rights reserved.
Pages: 23
Related papers
49 in total
  • [21] Finite-State Approximations to Constrained Markov Decision Processes with Borel Spaces
    Saldi, Naci
    Yuksel, Serdar
    Linder, Tamas
    2015 53RD ANNUAL ALLERTON CONFERENCE ON COMMUNICATION, CONTROL, AND COMPUTING (ALLERTON), 2015, : 567 - 572
  • [22] Toward an Optimized Value Iteration Algorithm for Average Cost Markov Decision Processes
    Arruda, Edilson F.
    Ourique, Fabricio
    Almudevar, Anthony
    49TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2010, : 930 - 934
  • [23] Average control of Markov decision processes with Feller transition probabilities and general action spaces
    Costa, O. L. V.
    Dufour, F.
    JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 2012, 396 (01) : 58 - 69
  • [24] RISK-SENSITIVE AVERAGE MARKOV DECISION PROCESSES IN GENERAL SPACES
    Chen, Xian
    Wei, Qingda
    SIAM JOURNAL ON CONTROL AND OPTIMIZATION, 2024, 62 (04) : 2115 - 2147
  • [25] Finite-State Approximation of Markov Decision Processes with Unbounded Costs and Borel Spaces
    Saldi, Naci
    Yuksel, Serdar
    Linder, Tamas
    2015 54TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2015, : 5085 - 5090
  • [26] Approximation of average cost Markov decision processes using empirical distributions and concentration inequalities
    Dufour, Francois
    Prieto-Rumeau, Tomas
    STOCHASTICS-AN INTERNATIONAL JOURNAL OF PROBABILITY AND STOCHASTIC PROCESSES, 2015, 87 (02) : 273 - 307
  • [27] Reinforcement learning based algorithms for average cost Markov Decision Processes
    Abdulla, Mohammed Shahid
    Bhatnagar, Shalabh
    DISCRETE EVENT DYNAMIC SYSTEMS-THEORY AND APPLICATIONS, 2007, 17 (01) : 23 - 52
  • [28] Reinforcement Learning Based Algorithms for Average Cost Markov Decision Processes
    Mohammed Shahid Abdulla
    Shalabh Bhatnagar
    Discrete Event Dynamic Systems, 2007, 17 : 23 - 52
  • [29] Incremental Improvements of Heuristic Policies for Average-Reward Markov Decision Processes
    Reveliotis, S.
    Ibrahim, M.
    IFAC PAPERSONLINE, 2020, 53 (02) : 1721 - 1728
  • [30] Policy Iteration for Continuous-Time Average Reward Markov Decision Processes in Polish Spaces
    Zhu, Quanxin
    Yang, Xinsong
    Huang, Chuangxia
    ABSTRACT AND APPLIED ANALYSIS, 2009,