Deriving Explicit Control Policies for Markov Decision Processes Using Symbolic Regression

Cited by: 2
Authors
Hristov, A. [1]
Bosman, J. W. [1]
Bhulai, S. [2]
van der Mei, R. D. [1]
Affiliations
[1] Ctr Math & Comp Sci, Stochast Grp, Amsterdam, Netherlands
[2] Vrije Univ Amsterdam, Dept Math, Amsterdam, Netherlands
Source
PROCEEDINGS OF THE 13TH EAI INTERNATIONAL CONFERENCE ON PERFORMANCE EVALUATION METHODOLOGIES AND TOOLS (VALUETOOLS 2020) | 2020
Keywords
Markov Decision Processes; Genetic program; Symbolic regression; Threshold-type policy; Optimal control; Closed-form approximation
DOI
10.1145/3388831.3388840
Chinese Library Classification (CLC) number
TP31 [Computer software];
Discipline classification code
081202 ; 0835 ;
Abstract
In this paper, we introduce a novel approach to optimizing the control of systems that can be modeled as Markov decision processes (MDPs) with a threshold-based optimal policy. Our method is based on a specific type of genetic program known as symbolic regression (SR). We present how the performance of this program can be greatly improved by taking into account the corresponding MDP framework in which we apply it. The proposed method has two main advantages: (1) it results in near-optimal decision policies, and (2) in contrast to other algorithms, it generates closed-form approximations. Obtaining an explicit expression for the decision policy gives the opportunity to conduct sensitivity analysis, and allows instant calculation of a new threshold function for any change in the parameters. We emphasize that the introduced technique is highly general and applicable to MDPs that have a threshold-based policy. Extensive experimentation demonstrates the usefulness of the method.
Pages: 41-47
Page count: 7
Related papers
50 in total
  • [1] Markov decision processes based optimal control policies for probabilistic boolean networks
    Abul, O
    Alhajj, R
    Polat, F
    BIBE 2004: FOURTH IEEE SYMPOSIUM ON BIOINFORMATICS AND BIOENGINEERING, PROCEEDINGS, 2004, : 337 - 344
  • [2] Optimal adaptive policies for Markov decision processes
    Burnetas, AN
    Katehakis, MN
    MATHEMATICS OF OPERATIONS RESEARCH, 1997, 22 (01) : 222 - 255
  • [3] Robustness of policies in constrained Markov decision processes
    Zadorojniy, A
    Shwartz, A
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2006, 51 (04) : 635 - 638
  • [4] Optimal control in light traffic Markov decision processes
    Ger Koole
    Olaf Passchier
    Mathematical Methods of Operations Research, 1997, 45 : 63 - 79
  • [5] Optimal control in light traffic Markov decision processes
    Koole, G
    Passchier, O
    MATHEMATICAL METHODS OF OPERATIONS RESEARCH, 1997, 45 (01) : 63 - 79
  • [6] Symbolic algorithms for qualitative analysis of Markov decision processes with Buchi objectives
    Chatterjee, Krishnendu
    Henzinger, Monika
    Joglekar, Manas
    Shah, Nisarg
    FORMAL METHODS IN SYSTEM DESIGN, 2013, 42 (03) : 301 - 327
  • [7] Learning Robust Policies for Uncertain Parametric Markov Decision Processes
    Rickard, Luke
    Abate, Alessandro
    Margellos, Kostas
    6TH ANNUAL LEARNING FOR DYNAMICS & CONTROL CONFERENCE, 2024, 242 : 876 - 889
  • [8] Learning Parameterized Policies for Markov Decision Processes through Demonstrations
    Hanawal, Manjesh K.
    Liu, Hao
    Zhu, Henghui
    Paschalidis, Ioannis Ch.
    2016 IEEE 55TH CONFERENCE ON DECISION AND CONTROL (CDC), 2016, : 7087 - 7092
  • [9] On functional equations for Kth best policies in Markov decision processes
    Chang, Hyeong Soo
    AUTOMATICA, 2013, 49 (01) : 297 - 300
  • [10] Control Logic Synthesis for Manufacturing Systems Using Markov Decision Processes
    Lee, Changmin
    Park, Jehyun
    Choi, Jongeun
    Ha, Jaebok
    Lee, Sangyeong
    IFAC PAPERSONLINE, 2021, 54 (20) : 495 - 502