Optimally solving Markov decision processes with total expected discounted reward function: Linear programming revisited

被引：10

作者：

Alagoz, Oguzhan ^{[1
]}

Ayvaci, Mehmet U. S. ^{[2
]}

Linderoth, Jeffrey T. ^{[1
]}

机构：

[1] Univ Wisconsin, Dept Ind & Syst Engn, Madison, WI 53706 USA

[2] Univ Texas Dallas, Jindal Sch Management, Dallas, TX 75230 USA

来源：

COMPUTERS & INDUSTRIAL ENGINEERING | 2015年 / 87卷

基金：

美国国家科学基金会;

关键词：

Markov decision process; MDP; Linear programming; Policy iteration; Total expected discounted reward; Treatment optimization; LIVING-DONOR; CONVERGENCE; SYSTEM;

D O I：

10.1016/j.cie.2015.05.031

中图分类号：

TP39 [计算机的应用];

学科分类号：

081203 ; 0835 ;

摘要：

We compare the computational performance of linear programming (LP) and the policy iteration algorithm (PIA) for solving discrete-time infinite-horizon Markov decision process (MDP) models with total expected discounted reward. We use randomly generated test problems as well as a real-life health-care problem to empirically show that, unlike previously reported, barrier methods for LP provide a viable tool for optimally solving such MDPs. The dimensions of comparison include transition probability matrix structure, state and action size, and the LP solution method. (C) 2015 Elsevier Ltd. All rights reserved.

引用

页码：311 / 316

页数：6

共 49 条

[1]

Agrawal P., 2007, 2007 IEEE INT S WORL

[2]

Akselrod D., 2008, 2008 INT C INF FUS

[3]

Al-Zubaidy H., 2007, 2007 MIL COMM C

[4] Optimal Scheduling in High-Speed Downlink Packet Access Networks [J].

Al-Zubaidy, Hussein ;

Lambadaris, Ioannis ;

Talim, Jerome .

ACM TRANSACTIONS ON MODELING AND COMPUTER SIMULATION, 2010, 21 (01)

[5] The optimal timing of living-donor liver transplantation [J].

Alagoz, O ;

Maillart, LM ;

Schaefer, AJ ;

Roberts, MS .

MANAGEMENT SCIENCE, 2004, 50 (10) :1420-1430

[6] Choosing among living-donor and cadaveric livers [J].

Alagoz, Oguzhan ;

Maillart, Lisa M. ;

Schaefer, Andrew J. ;

Roberts, Mark S. .

MANAGEMENT SCIENCE, 2007, 53 (11) :1702-1715

[7] Determining the acceptance of cadaveric livers using an implicit model of the waiting list [J].

Alagoz, Oguzhan ;

Maillart, Lisa M. ;

Schaefer, Andrew J. ;

Roberts, Mark S. .

OPERATIONS RESEARCH, 2007, 55 (01) :24-36

[8]

[Anonymous], 2007, Approximate Dynamic Programming: Solving the Curses of Dimensionality

[9] Approximate dynamic programming via direct search in the space of value function approximations [J].

Arruda, E. F. ;

Fragoso, M. D. ;

do Val, J. B. R. .

EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2011, 211 (02) :343-351

[10]

Asadian A, 2010, P AMER CONTR CONF, P2785

← 1 2 3 4 5 →