Maximizing the set of recurrent states of an MDP subject to convex constraints

Cited by: 3

Authors:

Arvelo, Eduardo [1]

Martins, Nuno C. [1]

Affiliations:

[1] Univ Maryland, Dept Elect & Comp Engn, College Pk, MD 20742 USA

Source:

Automatica | 2014 / Vol. 50 / No. 3

Keywords:

Maximum entropy; Markov decision problems; Markov models; Convex optimization; Optimal control; Controlled Markov chains

DOI:

10.1016/j.automatica.2014.01.002

Chinese Library Classification (CLC):

TP [Automation technology, computer technology]

Discipline classification code:

0812

Abstract:

This paper focuses on the design of time-homogeneous, fully observed Markov decision processes (MDPs) with finite state and action spaces. The main objective is to obtain policies that generate the maximal set of recurrent states, subject to convex constraints on the set of invariant probability mass functions. We propose a design method that relies on a finitely parametrized convex program inspired by principles of entropy maximization. A numerical example is provided to illustrate these ideas. © 2014 Published by Elsevier Ltd.
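To make the objective concrete, the following minimal sketch computes the set of recurrent states that a given stationary randomized policy induces on a small finite MDP. It is not the paper's convex program and it ignores the convex constraints on invariant distributions; it only illustrates the quantity being maximized. The names `closed_loop_matrix`, `recurrent_states`, `Q`, `K_det`, and `K_mix`, as well as the three-state example, are hypothetical and chosen for illustration.

```python
def closed_loop_matrix(Q, K):
    """Closed-loop transition matrix: P[x][y] = sum_u K[x][u] * Q[x][u][y].

    Q[x][u][y] is the probability of moving from state x to state y under
    action u; K[x][u] is the probability that the policy picks u in state x.
    """
    n = len(Q)
    return [
        [sum(K[x][u] * Q[x][u][y] for u in range(len(K[x]))) for y in range(n)]
        for x in range(n)
    ]


def recurrent_states(P, tol=1e-12):
    """Return the recurrent states of a finite Markov chain with matrix P.

    In a finite chain, state i is recurrent exactly when every state j
    reachable from i can also reach i back.
    """
    n = len(P)
    # reach[i][j] is True when j is reachable from i in zero or more steps.
    reach = [[i == j or P[i][j] > tol for j in range(n)] for i in range(n)]
    for k in range(n):  # boolean transitive closure (Warshall)
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return {
        i for i in range(n)
        if all(reach[j][i] for j in range(n) if reach[i][j])
    }


# Hypothetical 3-state, 2-action MDP (illustrative numbers only).
Q = [
    [[1, 0, 0], [0, 1, 0]],  # state 0: action 0 stays, action 1 moves to 1
    [[1, 0, 0], [0, 0, 1]],  # state 1: action 0 moves to 0, action 1 moves to 2
    [[0, 1, 0], [0, 0, 1]],  # state 2: action 0 moves to 1, action 1 stays
]
K_det = [[1.0, 0.0]] * 3  # deterministic policy: always action 0
K_mix = [[0.5, 0.5]] * 3  # randomized policy: both actions equally likely

rec_det = recurrent_states(closed_loop_matrix(Q, K_det))
rec_mix = recurrent_states(closed_loop_matrix(Q, K_mix))
```

In this toy instance the randomized policy keeps every state recurrent, while the deterministic one collapses the recurrent set to a single state, which hints at why a criterion that favors spread-out invariant distributions, such as entropy, is a natural fit for this design goal.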

Pages: 994-998

Page count: 5
