Deriving Explicit Control Policies for Markov Decision Processes Using Symbolic Regression

Cited by: 2

Authors:

Hristov, A. [1]

Bosman, J. W. [1]

Bhulai, S. [2]

van der Mei, R. D. [1]

Affiliations:

[1] Ctr Math & Comp Sci, Stochast Grp, Amsterdam, Netherlands

[2] Vrije Univ Amsterdam, Dept Math, Amsterdam, Netherlands

Source:

PROCEEDINGS OF THE 13TH EAI INTERNATIONAL CONFERENCE ON PERFORMANCE EVALUATION METHODOLOGIES AND TOOLS (VALUETOOLS 2020) | 2020

Keywords:

Markov Decision Processes; Genetic program; Symbolic regression; Threshold-type policy; Optimal control; Closed-form approximation

DOI:

10.1145/3388831.3388840

Chinese Library Classification (CLC) number:

TP31 [Computer Software];

Discipline classification code:

081202 ; 0835 ;

Abstract:

In this paper, we introduce a novel approach to optimizing the control of systems that can be modeled as Markov decision processes (MDPs) with a threshold-based optimal policy. Our method is based on a specific type of genetic program known as symbolic regression (SR). We present how the performance of this program can be greatly improved by taking into account the corresponding MDP framework in which we apply it. The proposed method has two main advantages: (1) it results in near-optimal decision policies, and (2) in contrast to other algorithms, it generates closed-form approximations. Obtaining an explicit expression for the decision policy gives the opportunity to conduct sensitivity analysis, and allows instant calculation of a new threshold function for any change in the parameters. We emphasize that the introduced technique is highly general and applicable to MDPs that have a threshold-based policy. Extensive experimentation demonstrates the usefulness of the method.
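To make the notion of a threshold-based policy concrete, the following is a minimal Python sketch on a toy problem that is not taken from the paper: admission control for a single-server queue, solved by discounted value iteration. The model, the function names (`admission_policy`, `threshold_of`, `is_threshold_type`) and all parameter values are assumptions chosen for illustration. The sketch only tabulates the optimal threshold for a few rejection costs; the symbolic regression step that the abstract describes, which would fit a closed-form expression to such data, is not shown.

```python
def admission_policy(lam, mu, hold_cost, reject_cost,
                     beta=0.95, n_max=50, iters=2000):
    """Value iteration for a toy admission-control queue (illustrative only).

    State x is the queue length (0..n_max). Each step an arrival occurs with
    probability lam/(lam+mu) and a service completion otherwise. An arrival
    is either admitted (x -> x+1) or rejected at cost reject_cost; a full
    queue must reject. Holding cost is hold_cost * x per step.
    Returns a list of booleans: admit an arrival in state x?
    """
    p_arr = lam / (lam + mu)
    p_dep = mu / (lam + mu)
    v = [0.0] * (n_max + 1)
    for _ in range(iters):
        new_v = []
        for x in range(n_max + 1):
            reject = reject_cost + v[x]
            admit = v[x + 1] if x < n_max else float("inf")
            new_v.append(hold_cost * x
                         + beta * (p_arr * min(admit, reject)
                                   + p_dep * v[max(x - 1, 0)]))
        v = new_v
    # Admit in state x when joining is no more costly than rejecting.
    return [x < n_max and v[x + 1] <= reject_cost + v[x]
            for x in range(n_max + 1)]


def threshold_of(admits):
    """Smallest queue length at which an arrival is rejected."""
    return next(x for x, a in enumerate(admits) if not a)


def is_threshold_type(admits):
    """True if every admitting state precedes every rejecting state."""
    return admits == sorted(admits, reverse=True)


if __name__ == "__main__":
    # Tabulate the threshold for several (hypothetical) rejection costs.
    for r in (0.0, 5.0, 10.0, 20.0, 40.0):
        policy = admission_policy(lam=0.6, mu=0.4,
                                  hold_cost=1.0, reject_cost=r)
        print(r, threshold_of(policy), is_threshold_type(policy))
```

A table of (parameter, threshold) pairs like the one printed here is the kind of input to which a closed-form threshold function could be fitted. Once such an expression is available, a new threshold follows from evaluating it directly rather than re-running the dynamic program.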


Pages: 41-47

Page count: 7

