Sample-path optimality and variance-minimization of average cost Markov control processes

Cited by: 47
Authors
Hernández-Lerma, O
Vega-Amaya, O
Carrasco, G
Affiliations
[1] Inst Politecn Nacl, Ctr Invest & Estudios Avanzados, Dept Matemat, Mexico City 07000, DF, Mexico
[2] Univ Sonora, Dept Matemat, Hermosillo, Sonora, Mexico
[3] Univ Nacl Autonoma Mexico, Fac Ciencias, Dept Matemat, Mexico City 04510, DF, Mexico
Keywords
(discrete-time) Markov control processes; average cost criteria; sample-path average cost; expected average cost; canonical policies; average variance
DOI
10.1137/S0363012998340673
Chinese Library Classification number
TP [Automation technology, computer technology]
Subject classification code
0812
Abstract
This paper studies several average-cost criteria for Markov control processes on Borel spaces with possibly unbounded costs. Under suitable hypotheses we show (i) the existence of a sample-path average cost (SPAC-) optimal stationary policy; (ii) a stationary policy is SPAC-optimal if and only if it is expected average cost (EAC-) optimal; and (iii) within the class of stationary SPAC-optimal (equivalently, EAC-optimal) policies there exists one with a minimal limiting average variance.
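The three criteria named in the abstract can be written out as follows. This is a sketch in commonly used notation, not taken verbatim from the paper: the symbols $c$ (one-stage cost), $x_t, a_t$ (state and action at time $t$), $J_n^0$, $J^0$, $J$, $V$, and $\rho$ are assumptions introduced here for illustration, and the paper's exact definitions and hypotheses may differ.

```latex
% n-stage sample-path cost under policy \pi and initial state x
J_n^0(\pi, x) := \sum_{t=0}^{n-1} c(x_t, a_t)

% Sample-path average cost (a random variable)
J^0(\pi, x) := \limsup_{n \to \infty} \frac{1}{n} \, J_n^0(\pi, x)

% Expected average cost (a number)
J(\pi, x) := \limsup_{n \to \infty} \frac{1}{n} \, E_x^{\pi}\!\left[ J_n^0(\pi, x) \right]

% Limiting average variance
V(\pi, x) := \limsup_{n \to \infty} \frac{1}{n} \,
  E_x^{\pi}\!\left[ \left( J_n^0(\pi, x) - E_x^{\pi}\!\left[ J_n^0(\pi, x) \right] \right)^{2} \right]

% One way to state SPAC-optimality of \pi^*: for some constant \rho,
% and for every policy \pi and initial state x,
J^0(\pi^*, x) = \rho \quad P_x^{\pi^*}\text{-a.s.}, \qquad
J^0(\pi, x) \ge \rho \quad P_x^{\pi}\text{-a.s.}
```

Read this way, result (ii) says that for stationary policies, minimizing the almost-sure quantity $J^0$ and minimizing the expectation-based quantity $J$ select the same policies, and result (iii) says that among those policies one can be chosen that also minimizes $V$.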
Pages: 79-93
Page count: 15