An adaptation of particle swarm optimization for Markov decision processes

Citations: 0
Authors
Chang, HS [1]
Affiliations
[1] Sogang Univ, Dept Comp Sci & Engn, Seoul, South Korea
Source
2004 IEEE INTERNATIONAL CONFERENCE ON SYSTEMS, MAN & CYBERNETICS, VOLS 1-7 | 2004
Keywords
particle swarm optimization; Markov decision process; reinforcement learning;
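DOI
Not available
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract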
In this paper, we adapt the metaheuristic of particle swarm optimization (PSO) for solving nonstochastic optimization problems into a novel convergent algorithm for solving Markov Decision Processes (MDPs) with the infinite-horizon discounted-cost criterion. We show that the algorithm converges to an optimal policy with probability one. We further study how to adapt PSO to develop a PSO-based reinforcement learning algorithm for the case where the transition and cost dynamics of a given MDP are unknown to the decision maker.
Pages: 1643-1648
Page count: 6
Related papers
16 in total
  • [1] Bertsekas D., 2012, Dynamic Programming and Optimal Control, V1
  • [2] Bertsekas D., 1996, NEURO DYNAMIC PROGRA, V1st
  • [3] Bertsekas DP, 1995, Dynamic Programming and Optimal Control, V2
  • [4] Chang HS, 2004, P AMER CONTR CONF, P3820
  • [5] Parallel rollout for online solution of partially observable Markov decision processes
    Chang, HS
    Givan, R
    Chong, EKP
    [J]. DISCRETE EVENT DYNAMIC SYSTEMS-THEORY AND APPLICATIONS, 2004, 14 (03): : 309 - 341
  • [6] CHANG HS, 2004, IEEE T AUTOMATIC CON
  • [7] CHANG HS, 2004, P IEEE C SYST MAN CY
  • [8] CHANG HS, 2003, P 42 IEEE C DEC CONT
  • [9] DEJONG KA, 1975, THESIS U MICH ANN AR
  • [10] Eberhart R.C., 2001, Swarm Intelligence