A Special Case of Partially Observable Markov Decision Processes Problem by Event-Based Optimization

Cited by: 0
Authors: Zhang, Junyu [1]
Affiliations: [1] Sun Yat Sen Univ, Sch Math & Computat Sci, Guangzhou 510275, Guangdong, Peoples R China
Source: PROCEEDINGS 2016 IEEE INTERNATIONAL CONFERENCE ON INDUSTRIAL TECHNOLOGY (ICIT), 2016
Keywords: (none listed)
DOI: Not available
Chinese Library Classification (CLC): TP301 [Theory, Methods]
Subject classification code: 081202
Abstract
In this paper, we study a class of partially observable Markov decision process (POMDP) problems using the event-based optimization framework proposed in [4]. A POMDP ([7], [8]) generalizes the standard, completely observable Markov decision process (MDP) by allowing imperfect information about the system state. Policy iteration algorithms for general POMDPs have proved impractical because they are very difficult to implement, so most work on POMDPs has relied on value iteration. For the special case considered here, however, the POMDP can be reformulated as an MDP. We then apply the sensitivity-based view to derive the corresponding average-reward difference formula. Based on this formula and the idea of event-based optimization, we estimate the aggregated potentials from a single sample path and develop policy iteration (PI) algorithms.
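As a rough illustration of the sample-path approach described in the abstract, the sketch below simulates a toy model in which the policy depends only on the current observation (taken here as the "event"), estimates performance potentials from a single sample path, aggregates them by observation, and performs a greedy policy-improvement step. Everything in it, including the state space, observation map, transition matrices, rewards, and helper names such as estimate_potentials and improve, is hypothetical and chosen for illustration; it is a minimal sketch in the spirit of event-based optimization, not the algorithm developed in the paper or in [4].

import numpy as np

# A toy, fully specified model -- all numbers below are invented for illustration.
rng = np.random.default_rng(0)
S, A, O = 4, 2, 2                      # number of states, actions, observations
obs_of_state = np.array([0, 0, 1, 1])  # deterministic observation (event) for each state
P = rng.dirichlet(np.ones(S), size=(A, S))  # P[a, s, :] = next-state distribution
f = rng.uniform(0.0, 1.0, size=(S, A))      # one-step reward f(s, a)

def simulate(policy, T=100_000):
    """Run a single sample path under an observation-based policy (action = policy[obs])."""
    xs = np.empty(T, dtype=int)
    rs = np.empty(T)
    x = 0
    for t in range(T):
        a = policy[obs_of_state[x]]
        xs[t], rs[t] = x, f[x, a]
        x = rng.choice(S, p=P[a, x])
    return xs, rs

def estimate_potentials(xs, rs, K=50):
    """Estimate the average reward eta and the potentials g(s) from one sample path:
    g(s) ~ E[ sum_{k=0}^{K-1} (r_{t+k} - eta) | X_t = s ]."""
    eta = rs.mean()
    csum = np.concatenate(([0.0], np.cumsum(rs - eta)))
    g, counts = np.zeros(S), np.zeros(S)
    for t in range(len(xs) - K):
        g[xs[t]] += csum[t + K] - csum[t]
        counts[xs[t]] += 1
    return eta, g / np.maximum(counts, 1.0)

def improve(policy, xs, g):
    """Greedy improvement per observation: weight the states consistent with each
    observation by their empirical frequencies and compare actions via f + P g."""
    new_policy = policy.copy()
    for y in range(O):
        states = np.where(obs_of_state == y)[0]
        visits = np.array([(xs == s).sum() for s in states], dtype=float)
        if visits.sum() == 0:
            continue  # observation never seen on this path; keep the old action
        w = visits / visits.sum()
        q = [w @ (f[states, a] + P[a, states] @ g) for a in range(A)]
        new_policy[y] = int(np.argmax(q))
    return new_policy

policy = np.zeros(O, dtype=int)        # start with action 0 for every observation
for it in range(5):
    xs, rs = simulate(policy)
    eta, g = estimate_potentials(xs, rs)
    policy = improve(policy, xs, g)
    print(f"iteration {it}: estimated average reward = {eta:.4f}, policy = {policy.tolist()}")

Because the policy here depends only on the observation and the observation is a deterministic function of the state, the closed-loop system is an ordinary Markov chain, which is what makes single-sample-path estimation of the potentials meaningful; this mirrors, in a simplified way, the special case the abstract refers to.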
Pages: 1522-1526 (5 pages)
References (14 in total)
  • [1] Cao, Xi-Ren; Zhang, Junyu. Event-based optimization of Markov systems. IEEE Transactions on Automatic Control, 2008, 53(4): 1076-1082.
  • [2] Cao, Xi-Ren; Zhang, Junyu. The nth-order bias optimality for multichain Markov decision processes. IEEE Transactions on Automatic Control, 2008, 53(2): 496-508.
  • [3] Cao, Xi-Ren; Wang, De-Xin; Qiu, Li. Partial-information state-based optimization of partially observable Markov decision processes and the separation principle. IEEE Transactions on Automatic Control, 2014, 59(4): 921-936.
  • [4] Cao, X.-R. Basic ideas for event-based optimization of Markov systems. Discrete Event Dynamic Systems: Theory and Applications, 2005, 15(2): 169-197.
  • [5] Cao, X.-R. The relations among potentials, perturbation analysis, and Markov decision processes. Discrete Event Dynamic Systems: Theory and Applications, 1998, 8(1): 71-87.
  • [6] Cheng, H. 2014 IEEE International Conference on Robotics and Automation (ICRA), 2014.
  • [7] Jaakkola, T. Advances in Neural Information Processing Systems 7, 1995: 345.
  • [8] Kaelbling, L. P.; Littman, M. L.; Cassandra, A. R. Planning and acting in partially observable stochastic domains. Artificial Intelligence, 1998, 101(1-2): 99-134.
  • [9] Kreutz, C.; Honerkamp, J. Controlling the continuous positive airway pressure-device using partial observable Markov decision processes. Modelling, Simulation and Optimization of Complex Processes, 2005: 273-286.
  • [10] Littman, M. L. Machine Learning: Proceedings of the Twelfth International Conference on Machine Learning, 1995: 362.