MINIMIZING RISK PROBABILITY FOR INFINITE DISCOUNTED PIECEWISE DETERMINISTIC MARKOV DECISION PROCESSES

Cited by: 0
Authors
Huo, Haifeng [1]
Cui, Jinhua [1]
Wen, Xian [1]
Affiliations
[1] Guangxi Univ Sci & Technol, Sch Sci, Liuzhou 545006, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
optimal policy; the value iteration algorithm; piecewise deterministic Markov decision processes; risk probability criterion; 1ST PASSAGE OPTIMALITY; TIME; MINIMIZATION; MODELS;
DOI
10.14736/kyb-2024-3-0357
Chinese Library Classification number
TP3 [Computing technology, computer technology]
Discipline classification code
0812
Abstract
This paper studies the risk probability problem for infinite-horizon piecewise deterministic Markov decision processes (PDMDPs) with varying discount factors and unbounded transition rates. In contrast to the usual expected total reward criterion, we aim to minimize the risk probability that the total rewards do not exceed a given target value. Under a non-explosiveness condition on the controlled state process that is slightly weaker than the corresponding conditions in the previous literature, we prove the existence and uniqueness of a solution to the optimality equation, and establish the existence of a risk probability optimal policy by using the value iteration algorithm. Finally, we provide two examples to illustrate our results: the first explains and verifies our conditions, and the second shows computational results for the value function and the risk probability optimal policy.
Pages: 357-378
Number of pages: 22