Markov decision processes with constrained stopping times

Cited by: 0

Authors:

Horiguchi, M [1]

Kurano, M [1]

Yasuda, M [1]

Affiliation:

[1] Chiba Univ, Div Math Sci & Phys, Grad Sch Sci & Technol, Inage Ku, Chiba 2638522, Japan

Source:

PROCEEDINGS OF THE 39TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-5 | 2000

Keywords:

Markov decision process; constrained stopping time; Lagrange multiplier; OLA policy

DOI:

Not available

Chinese Library Classification:

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

We consider the optimization problem for a stopped Markov decision process in which the optimization is taken over stopping times $\tau$ constrained so that $E[\tau] \le \alpha$ for some fixed $\alpha > 0$. We introduce the concept of a randomized stationary stopping time, which is a mixed extension of the entry time of a stopping region, and prove the existence of an optimal constrained pair of stationary policy and stopping time by a Lagrange multiplier approach. Applying the idea of the one-step look-ahead (OLA) policy, we then construct the optimal constrained pair concretely. As an example, a constrained Markov deteriorating system is discussed.
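The Lagrange multiplier approach named in the abstract can be sketched as follows. This is a generic illustration, not the paper's own formulation: the reward-maximization form, the terminal reward $g$, the state process $X_t$, the policy symbol $\pi$, and the multiplier $\lambda$ are all notation assumed here for illustration.

```latex
% Assumed constrained problem (notation is illustrative, not from the paper):
%   maximize the expected terminal reward over pairs (\pi, \tau)
%   subject to the expected stopping time bound.
\[
  \max_{(\pi,\tau)} \; E^{\pi}\!\left[ g(X_\tau) \right]
  \quad \text{subject to} \quad E^{\pi}[\tau] \le \alpha .
\]

% Lagrangian relaxation with multiplier \lambda \ge 0:
\[
  L(\pi,\tau;\lambda)
  = E^{\pi}\!\left[ g(X_\tau) \right] - \lambda \left( E^{\pi}[\tau] - \alpha \right)
  = E^{\pi}\!\left[ g(X_\tau) - \lambda \tau \right] + \lambda \alpha .
\]

% For fixed \lambda this is an unconstrained stopping problem with a
% per-step cost \lambda. Under a fixed stationary policy, the one-step
% look-ahead (OLA) stopping region is the set of states where stopping now
% is at least as good as continuing for exactly one more step:
\[
  B_\lambda = \left\{ x : g(x) \ge E\!\left[ g(X_1) \mid X_0 = x \right] - \lambda \right\}.
\]
```

In this sketch, the multiplier $\lambda$ acts as a price on elapsed time: raising it enlarges the stopping region $B_\lambda$ and lowers the expected stopping time. When no single $\lambda$ gives $E[\tau] = \alpha$ exactly, mixing between two stopping rules is one way to meet the bound, which is the role a randomized stopping time can play.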


Pages: 706-710

Page count: 5
