Markov decision processes with constrained stopping times

Cited by: 0
Authors
Horiguchi, M [1]
Kurano, M [1]
Yasuda, M [1]
Affiliations
[1] Chiba Univ, Div Math Sci & Phys, Grad Sch Sci & Technol, Inage Ku, Chiba 2638522, Japan
Source
PROCEEDINGS OF THE 39TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-5 | 2000
Keywords
Markov decision process; constrained stopping time; Lagrange multiplier; OLA policy;
DOI
Not available
Chinese Library Classification
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
We consider the optimization problem for a stopped Markov decision process, taken over stopping times τ constrained so that E[τ] ≤ α for some fixed α > 0. We introduce the concept of a randomized stationary stopping time, which is a mixed extension of the entry time of a stopping region, and prove the existence of an optimal constrained pair of stationary policy and stopping time by a Lagrange multiplier approach. Applying the idea of the one-step look-ahead (OLA) policy, we then seek the optimal constrained pair concretely. As an example, a constrained Markov deteriorating system is explained.
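The Lagrange multiplier approach named in the abstract can be sketched as follows. This is only an illustrative formulation under stated assumptions, not the paper's own notation: the symbols $\pi$ (policy), $J(\pi,\tau)$ (expected total reward collected up to the stopping time), and $\lambda$ (multiplier) are hypothetical, and the problem is assumed to be a maximization, which the abstract does not state.

```latex
% Constrained problem (assumed maximization; J, \pi, \lambda are illustrative symbols)
\[
  \max_{(\pi,\tau)} \; J(\pi,\tau)
  \quad \text{subject to} \quad E^{\pi}[\tau] \le \alpha .
\]
% Lagrangian relaxation with multiplier \lambda \ge 0
\[
  L^{\lambda}(\pi,\tau) \;=\; J(\pi,\tau) \;-\; \lambda \bigl( E^{\pi}[\tau] - \alpha \bigr).
\]
% Standard sufficiency argument: suppose (\pi^*,\tau^*) maximizes L^{\lambda},
% is feasible, and satisfies \lambda ( E^{\pi^*}[\tau^*] - \alpha ) = 0.
% Then for every feasible pair (\pi,\tau):
\[
  J(\pi,\tau) \;\le\; L^{\lambda}(\pi,\tau) \;\le\; L^{\lambda}(\pi^*,\tau^*) \;=\; J(\pi^*,\tau^*).
\]
```

The first inequality uses feasibility of $(\pi,\tau)$ together with $\lambda \ge 0$, and the final equality uses the complementary slackness condition, so the pair $(\pi^*,\tau^*)$ is optimal for the constrained problem under these assumptions.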
Pages: 706-710
Page count: 5
Related papers
50 in total
  • [1] On constrained Markov decision processes
    Department of Econometrics, University of Sydney, Sydney, NSW 2006, Australia
    Not specified
    Oper Res Lett, 1 (25-28):
  • [2] On constrained Markov decision processes
    Haviv, M
    OPERATIONS RESEARCH LETTERS, 1996, 19 (01) : 25 - 28
  • [3] STOPPING TIMES FOR RECURRENT MARKOV-PROCESSES
    BAXTER, JR
    CHACON, RV
    ILLINOIS JOURNAL OF MATHEMATICS, 1976, 20 (03) : 467 - 475
  • [4] Markov decision processes with a stopping time constraint
    Horiguchi, M
    MATHEMATICAL METHODS OF OPERATIONS RESEARCH, 2001, 53 (02) : 279 - 295
  • [5] Markov decision processes with a stopping time constraint
    Masayuki Horiguchi
    Mathematical Methods of Operations Research, 2001, 53 : 279 - 295
  • [6] Learning in Constrained Markov Decision Processes
    Singh, Rahul
    Gupta, Abhishek
    Shroff, Ness B.
    IEEE TRANSACTIONS ON CONTROL OF NETWORK SYSTEMS, 2023, 10 (01): : 441 - 453
  • [7] Dynamic programming in constrained Markov decision processes
    Piunovskiy, A. B.
    CONTROL AND CYBERNETICS, 2006, 35 (03): : 645 - 660
  • [8] Robustness of policies in constrained Markov decision processes
    Zadorojniy, A
    Shwartz, A
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2006, 51 (04) : 635 - 638
  • [9] Reinforcement Learning for Constrained Markov Decision Processes
    Gattami, Ather
    Bai, Qinbo
    Aggarwal, Vaneet
    24TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND STATISTICS (AISTATS), 2021, 130
  • [10] Relaxation for Constrained Decentralized Markov Decision Processes
    Xu, Jie
    AAMAS'16: PROCEEDINGS OF THE 2016 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS, 2016, : 1313 - 1314