Time aggregated Markov decision processes via standard dynamic programming

Cited by: 6
Authors
Arruda, Edilson F. [2 ]
Fragoso, Marcelo D. [1 ]
Affiliations
[1] Natl Lab Sci Computat LNCC, Ctr Syst & Control CSC, BR-25651075 Petropolis, RJ, Brazil
[2] Pontif Catholic Univ Rio Grande do Sul, Sch Engn, Porto Alegre, RS, Brazil
Keywords
Markov decision processes; Time aggregation; Dynamic programming;
DOI
10.1016/j.orl.2011.03.006
Chinese Library Classification number
C93 [Management Science]; O22 [Operations Research];
Discipline classification code
070105 ; 12 ; 1201 ; 1202 ; 120202 ;
Abstract
This note addresses the time aggregation approach to ergodic finite state Markov decision processes with uncontrollable states. We propose the use of the time aggregation approach as an intermediate step toward constructing a transformed MDP whose state space is comprised solely of the controllable states. The proposed approach simplifies the iterative search for the optimal solution by eliminating the need to define an equivalent parametric function, and results in a problem that can be solved by simpler, standard MDP algorithms. (C) 2011 Elsevier B.V. All rights reserved.
Pages: 193-197
Page count: 5
Related papers
9 in total
[1]   Standard Dynamic Programming Applied to Time Aggregated Markov Decision Processes [J].
Arruda, Edilson F. ;
Fragoso, Marcelo D. .
PROCEEDINGS OF THE 48TH IEEE CONFERENCE ON DECISION AND CONTROL, 2009 HELD JOINTLY WITH THE 2009 28TH CHINESE CONTROL CONFERENCE (CDC/CCC 2009), 2009, :2576-2580
[2]   A time aggregation approach to Markov decision processes [J].
Cao, XR ;
Ren, ZY ;
Bhatnagar, S ;
Fu, M ;
Marcus, S .
AUTOMATICA, 2002, 38 (06) :929-943
[3]   Joint replacement in an operational planning phase [J].
Dekker, R ;
Wildeman, RE ;
van Egmond, R .
EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 1996, 91 (01) :74-88
[4]  
FAINBERG EA, 1986, THEORY PROBABILITY I, V31, P658
[5]  
Howard R., 1971, Dynamic Probabilistic Systems, Vol. II: Semi-Markov and Decision Processes
[6]   Exact finite approximations of average-cost countable Markov decision processes [J].
Leizarowitz, Arie ;
Shwartz, Adam .
AUTOMATICA, 2008, 44 (06) :1480-1487
[7]  
Puterman M.L., 2008, Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley Series in Probability and Statistics
[8]   Markov decision processes with fractional costs [J].
Ren, ZY ;
Krogh, BH .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2005, 50 (05) :646-650
[9]   Incremental value iteration for time-aggregated Markov-decision processes [J].
Sun, Tao ;
Zhao, Qianchuan ;
Luh, Peter B. .
IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2007, 52 (11) :2177-2182