Genetic algorithm methods for solving the best stationary policy of finite Markov decision processes

Cited by: 4
Authors
Chen, HH [1]
Jafari, AA [1]
Affiliation
[1] New York Inst Technol, Dept Elect Engn & Comp Sci, Old Westbury, NY 11568 USA
Source
THIRTIETH SOUTHEASTERN SYMPOSIUM ON SYSTEM THEORY (SSST) | 1998
DOI
10.1109/SSST.1998.660132
Chinese Library Classification number
TP301 [Theory, Methods];
Discipline classification code
081202;
Abstract
This paper describes a heuristic approach to finding the optimal stationary policy of a standard finite Markov decision process (MDP). For an MDP, the so-called policy-improvement algorithm can be used to determine optimal policies: it starts from an arbitrary policy f(0) and produces a sequence of improved policies f(1), f(2), f(3), ..., f(k) until an optimal policy is reached. In this paper, we propose using a genetic algorithm to search for the best policy, which can then be regarded as an optimal policy. The method is a three-stage cyclic process consisting of reproduction (selection), recombination (mating), and evaluation (survival of the fittest); the process terminates when a convergence condition is met. The individual with the highest fitness represents the best policy. Finally, the computational advantages of the genetic algorithm approach are discussed.
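The three-stage cycle described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 3-state, 2-action MDP, the discounted-reward fitness, the discount factor, tournament selection, one-point crossover, the mutation step, elitism, and the "no improvement for `patience` generations" stopping rule are all assumptions chosen for illustration, and every name below is hypothetical.

```python
import random

# Hypothetical 3-state, 2-action MDP (illustrative numbers, not from the paper).
# P[s][a] is the next-state distribution, R[s][a] the expected one-step reward.
P = [
    [[0.7, 0.3, 0.0], [0.1, 0.6, 0.3]],
    [[0.0, 0.5, 0.5], [0.4, 0.4, 0.2]],
    [[0.3, 0.0, 0.7], [0.0, 0.2, 0.8]],
]
R = [[1.0, 0.5], [0.0, 2.0], [1.5, 0.2]]
GAMMA = 0.9  # assumed discount factor
N_STATES, N_ACTIONS = 3, 2


def fitness(policy, sweeps=200):
    """Sum of discounted state values under a stationary policy."""
    v = [0.0] * N_STATES
    for _ in range(sweeps):  # iterative policy evaluation
        v = [
            R[s][policy[s]]
            + GAMMA * sum(p * v[t] for t, p in enumerate(P[s][policy[s]]))
            for s in range(N_STATES)
        ]
    return sum(v)


def tournament(pop, rng, k=2):
    """Reproduction (selection): keep the fitter of k random individuals."""
    return max(rng.sample(pop, k), key=fitness)


def crossover(a, b, rng):
    """Recombination (mating): one-point crossover of two policies."""
    cut = rng.randrange(1, N_STATES)
    return a[:cut] + b[cut:]


def mutate(policy, rng, rate=0.1):
    """Redraw each state's action at random with probability `rate`."""
    return tuple(
        rng.randrange(N_ACTIONS) if rng.random() < rate else a for a in policy
    )


def genetic_search(pop_size=8, max_gens=50, patience=10, seed=0):
    """Return the fittest policy found and the best fitness per generation."""
    rng = random.Random(seed)
    pop = [
        tuple(rng.randrange(N_ACTIONS) for _ in range(N_STATES))
        for _ in range(pop_size)
    ]
    best = max(pop, key=fitness)
    history, stale = [fitness(best)], 0
    for _ in range(max_gens):
        children = []
        for _ in range(pop_size - 1):
            child = crossover(tournament(pop, rng), tournament(pop, rng), rng)
            children.append(mutate(child, rng))
        pop = [best] + children  # elitism: the best policy always survives
        new_best = max(pop, key=fitness)  # evaluation (survival of the fittest)
        stale = 0 if fitness(new_best) > fitness(best) else stale + 1
        best = new_best
        history.append(fitness(best))
        if stale >= patience:  # convergence condition: no recent improvement
            break
    return best, history
```

Calling `best, history = genetic_search()` returns a policy as a tuple with one action index per state, together with the best fitness seen in each generation. Because the elite individual is carried over unchanged, that fitness never decreases, but a genetic search of this kind gives no guarantee that the returned policy is the true optimum.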
Pages: 538-543
Number of pages: 6