Light robustness in the optimization of Markov decision processes with uncertain parameters

被引：6

作者：

Buchholz, Peter ^{[1
]}

Scheftelowitsch, Dimitri ^{[1
]}

机构：

[1] TU Dortmund, Informat 4, D-44221 Dortmund, Germany

来源：

COMPUTERS & OPERATIONS RESEARCH | 2019年 / 108卷

关键词：

Markov decision process; Parameter uncertainty; Robust optimization; Mixed Integer Linear Programming; Nonlinear Programming; PRICE;

D O I：

10.1016/j.cor.2019.04.004

中图分类号：

TP39 [计算机的应用];

学科分类号：

081203 ; 0835 ;

摘要：

Markov decision processes are often specified with limited knowledge of the real behavior or are part of a partially unknown environment such that transition rates and rewards are not exactly known. Different models to describe this uncertainty in a formal way have been proposed. In all cases it is important to consider uncertainty during the computation of optimal control policies. Usually this is done by computing robust solutions which are optimal in the worst realization of uncertainty. However, such solutions tend to be very conservative. In this paper, we develop an approach to mitigate robustness by computing policies that are optimal in a predefined situation, like the average case, but also guarantee a minimal gain in all other situations, including the worst case. We present algorithms based on policy iteration that solve subproblems using Mixed Integer Linear Programming (MILD) or Nonlinear Programming (NLP). (C) 2019 Elsevier Ltd. All rights reserved.

引用

页码：69 / 81

页数：13

共 33 条

[1]

Achterberg T., 2013, Facets of Combinatorial Optimization, P449, DOI [DOI 10.1007/978-3-642-38189-8, DOI 10.1007/978-3-642-38189-8_18, 10.1007/978-3-642-38189-8, 10.1007/978-3-642-38189-8_18]

[2]

[Anonymous], 1994, NONNEGATIVE MATRICES, DOI DOI 10.1137/1.9781611971262

[3]

[Anonymous], 1999, STOCH MODEL SER, DOI 10.1201/9781315140223

[4]

[Anonymous], 2012, Technical University of Denmark, DOI DOI 10.1017/CBO9780511470943.008

[5] Robust convex optimization [J].

Ben-Tal, A ;

Nemirovski, A .

MATHEMATICS OF OPERATIONS RESEARCH, 1998, 23 (04) :769-805

[6] The price of robustness [J].

Bertsimas, D ;

Sim, M .

OPERATIONS RESEARCH, 2004, 52 (01) :35-53

[7]

Buchholz Peter, 2017, Computer Performance Engineering. 14th European Workshop, EPEW 2017. Proceedings: LNCS 10497, P3, DOI 10.1007/978-3-319-66583-2_1

[8] Computation of weighted sums of rewards for concurrent MDPs [J].

Buchholz, Peter ;

Scheftelowitsch, Dimitri .

MATHEMATICAL METHODS OF OPERATIONS RESEARCH, 2019, 89 (01) :1-42

[9] Global optimization of MIQCPs with dynamic piecewise relaxations [J].

Castillo Castillo, Pedro A. ;

Castro, Pedro M. ;

Mahalec, Vladimir .

JOURNAL OF GLOBAL OPTIMIZATION, 2018, 71 (04) :691-716

[10]

Chatterjee K, 2006, LECT NOTES COMPUT SC, V3884, P325

← 1 2 3 4 →