Illustrated review of convergence conditions of the value iteration algorithm and the rolling horizon procedure for average-cost MDPs

被引:0
|
作者
Eugenio Della Vecchia
Silvia Di Marco
Alain Jean-Marie
机构
[1] CONICET-UNR,
[2] INRIA-LIRMM,undefined
来源
关键词
Markov decision problems; Value iteration; Heuristic methods; Rolling horizon;
D O I
暂无
中图分类号
学科分类号
摘要
This paper is concerned with the links between the Value Iteration algorithm and the Rolling Horizon procedure, for solving problems of stochastic optimal control under the long-run average criterion, in Markov Decision Processes with finite state and action spaces. We review conditions of the literature which imply the geometric convergence of Value Iteration to the optimal value. Aperiodicity is an essential prerequisite for convergence. We prove that the convergence of Value Iteration generally implies that of Rolling Horizon. We also present a modified Rolling Horizon procedure that can be applied to models without analyzing periodicity, and discuss the impact of this transformation on convergence. We illustrate with numerous examples the different results. Finally, we discuss rules for stopping Value Iteration or finding the length of a Rolling Horizon. We provide an example which demonstrates the difficulty of the question, disproving in particular a conjectured rule proposed by Puterman.
引用
收藏
页码:193 / 214
页数:21
相关论文
共 7 条
  • [1] Illustrated review of convergence conditions of the value iteration algorithm and the rolling horizon procedure for average-cost MDPs
    Della Vecchia, Eugenio
    Di Marco, Silvia
    Jean-Marie, Alain
    ANNALS OF OPERATIONS RESEARCH, 2012, 199 (01) : 193 - 214
  • [2] An Empirical Algorithm for Relative Value Iteration for Average-cost MDPs
    Gupta, Abhishek
    Jain, Rahul
    Glynn, Peter W.
    2015 54TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2015, : 5079 - 5084
  • [3] The convergence of value iteration in average cost Markov decision chains
    Sennott, LI
    OPERATIONS RESEARCH LETTERS, 1996, 19 (01) : 11 - 16
  • [4] On the Average-Cost Optimality Equations and Convergence of Discounted-Cost Relative Value Functions for Inventory Control Problems with Quasiconvex Cost Functions
    Feinberg, Eugene A.
    Liang, Yan
    2017 IEEE 56TH ANNUAL CONFERENCE ON DECISION AND CONTROL (CDC), 2017,
  • [5] Toward an Optimized Value Iteration Algorithm for Average Cost Markov Decision Processes
    Arruda, Edilson F.
    Ourique, Fabricio
    Almudevar, Anthony
    49TH IEEE CONFERENCE ON DECISION AND CONTROL (CDC), 2010, : 930 - 934
  • [6] Finite Convergence of Value Iteration Algorithm for Discounted Innite Horizon Optimal Control of Stochastic Logical Systems
    Wu, Yuhu
    Sun, Ximing
    Wang, Wei
    Shen, Tielong
    PROCEEDINGS OF THE 35TH CHINESE CONTROL CONFERENCE 2016, 2016, : 216 - 222
  • [7] Value iteration in a class of average controlled Markov chains with unbounded costs: Necessary and sufficient conditions for pointwise convergence
    CavazosCadena, R
    FernandezGaucherand, E
    JOURNAL OF APPLIED PROBABILITY, 1996, 33 (04) : 986 - 1002