Deriving Explicit Control Policies for Markov Decision Processes Using Symbolic Regression

Cited by: 2
Authors
Hristov, A. [1]
Bosman, J. W. [1]
Bhulai, S. [2]
van der Mei, R. D. [1]
Affiliations
[1] Ctr Math & Comp Sci, Stochast Grp, Amsterdam, Netherlands
[2] Vrije Univ Amsterdam, Dept Math, Amsterdam, Netherlands
Source
PROCEEDINGS OF THE 13TH EAI INTERNATIONAL CONFERENCE ON PERFORMANCE EVALUATION METHODOLOGIES AND TOOLS (VALUETOOLS 2020) | 2020
Keywords
Markov Decision Processes; Genetic program; Symbolic regression; Threshold-type policy; Optimal control; Closed-form approximation
DOI
10.1145/3388831.3388840
Chinese Library Classification number
TP31 [Computer software]
Subject classification code
081202; 0835
Abstract
In this paper, we introduce a novel approach to optimizing the control of systems that can be modeled as Markov decision processes (MDPs) with a threshold-based optimal policy. Our method is based on a specific type of genetic program known as symbolic regression (SR). We show how the performance of this program can be greatly improved by taking into account the MDP framework in which it is applied. The proposed method has two main advantages: (1) it results in near-optimal decision policies, and (2) in contrast to other algorithms, it generates closed-form approximations. An explicit expression for the decision policy makes sensitivity analysis possible and allows a new threshold function to be computed instantly for any change in the parameters. We emphasize that the introduced technique is general and applies to MDPs that have a threshold-based policy. Extensive experimentation demonstrates the usefulness of the method.
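As an illustration of what a threshold-based optimal policy looks like, the sketch below runs value iteration on a small admission-control queue and reads off the queue length at which rejecting becomes optimal. It is not the authors' symbolic regression method, and the model itself is an assumption made for this example: the function name, the reward and holding-cost parameters, the discount factor, and the truncation level `n_max` are all hypothetical. A symbolic regression step, as described in the abstract, would then try to fit a closed-form expression to thresholds computed this way over many parameter settings.

```python
def optimal_admission_threshold(lam, mu, reward, hold_cost,
                                n_max=50, discount=0.95, iters=500):
    """Value iteration for a hypothetical single-server admission-control queue.

    State x is the number of customers present (0..n_max). Each step is
    either an arrival (probability lam / (lam + mu)) or a potential
    departure (probability mu / (lam + mu)). An admitted arrival earns
    `reward`; every step costs `hold_cost` per customer present.
    Returns (threshold, policy), where policy[x] is True if admitting
    is optimal in state x and threshold is the first state that rejects.
    """
    p_arr = lam / (lam + mu)
    p_dep = mu / (lam + mu)
    v = [0.0] * (n_max + 1)
    for _ in range(iters):
        new_v = []
        for x in range(n_max + 1):
            # Admission is impossible once the buffer is full (assumption).
            admit = reward + v[x + 1] if x < n_max else float("-inf")
            reject = v[x]
            after_departure = v[max(x - 1, 0)]
            new_v.append(-hold_cost * x
                         + discount * (p_arr * max(admit, reject)
                                       + p_dep * after_departure))
        v = new_v
    # Greedy policy with respect to the (approximately) converged values.
    policy = [x < n_max and reward + v[x + 1] > v[x]
              for x in range(n_max + 1)]
    # policy[n_max] is always False, so a rejecting state always exists.
    threshold = next(x for x, admit in enumerate(policy) if not admit)
    return threshold, policy


if __name__ == "__main__":
    t, pol = optimal_admission_threshold(0.5, 0.5, reward=10.0, hold_cost=1.0)
    print("admit while queue length is below", t)
```

In this toy model the policy is fully described by a single number per parameter setting, which is the kind of structure that makes a closed-form threshold function a natural target.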
Pages: 41 - 47
Number of pages: 7
Related papers
50 in total
  • [22] Thermal image colorization using Markov decision processes
    Gu, Xiaojing
    He, Mengchi
    Gu, Xingsheng
    MEMETIC COMPUTING, 2017, 9 (01) : 15 - 22
  • [23] Allocating services to applications using Markov decision processes
    Bannazadeh, Hadi
    Leon-Garcia, Alberto
    IEEE INTERNATIONAL CONFERENCE ON SERVICE-ORIENTED COMPUTING AND APPLICATIONS, PROCEEDINGS, 2007, : 141 - +
  • [24] NEURAL DECODING SYSTEMS USING MARKOV DECISION PROCESSES
    Dantas, Henrique
    Mathews, V. John
    Wendelken, Suzanne M.
    Davis, Tyler S.
    Clark, Gregory A.
    Warren, David J.
    2017 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP), 2017, : 974 - 978
  • [25] On the adaptive control of a class of partially observed Markov decision processes
    Hsu, Shun-Pin
    Arapostathis, Ari
    JOURNAL OF MATHEMATICAL ANALYSIS AND APPLICATIONS, 2011, 380 (01) : 1 - 9
  • [26] On Exact Embedding Framework for Optimal Control of Markov Decision Processes
    Kharade, Sonam
    Sutavani, Sarang
    Yerudkar, Amol
    Wagh, Sushama
    Liu, Yang
    Del Vecchio, Carmen
    Singh, N. M.
    IEEE TRANSACTIONS ON AUTOMATIC CONTROL, 2024, 69 (02) : 1316 - 1323
  • [27] Switching control in multi-mode Markov decision processes
    Ren, ZY
    Krogh, BH
    PROCEEDINGS OF THE 40TH IEEE CONFERENCE ON DECISION AND CONTROL, VOLS 1-5, 2001, : 2095 - 2101
  • [28] Markov decision processes approximation with coupled dynamics via Markov deterministic control systems
    Portillo-Ramirez, Gustavo
    Cruz-Suarez, Hugo
    Lopez-Rios, Ruy
    Blancas-Rivera, Ruben
    OPEN MATHEMATICS, 2023, 21 (01):
  • [29] Detection-averse optimal and receding-horizon control for Markov decision processes
    Li, Nan
    Kolmanovsky, Ilya
    Girard, Anouck
    AUTOMATICA, 2020, 122
  • [30] Learning parametric policies and transition probability models of markov decision processes from data
    Xu, Tingting
    Zhu, Henghui
    Paschalidis, Ioannis Ch
    EUROPEAN JOURNAL OF CONTROL, 2021, 57 : 68 - 75