Markov Decision Process Design for Imitation of Optimal Task Schedulers

Cited by: 0
Authors
Rademacher, Paul [1 ]
Wagner, Kevin [2 ]
Smith, Leslie [1 ]
Affiliations
[1] US Naval Res Lab, Navy Ctr Appl Res AI, Washington, DC 20375 USA
[2] US Naval Res Lab, Div Radar, Washington, DC USA
Source
2023 IEEE STATISTICAL SIGNAL PROCESSING WORKSHOP (SSP), 2023
Keywords
Scheduling; imitation learning; Markov decision process; tree search
DOI
10.1109/SSP53291.2023.10207940
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Due to the generally prohibitive computational requirements of optimal task schedulers, much of the field of task scheduling focuses on designing fast suboptimal algorithms. Since the tree search commonly used by sequencing algorithms such as Branch-and-Bound can naturally be framed as a Markov decision process, designing schedulers using imitation and reinforcement learning is a promising and active area of research. This paper demonstrates how policies can be trained on previously solved scheduling problems and successfully generalize to novel ones. Rather than focusing on policy design, however, this work focuses on designing the Markov decision process observation and reward functions to make learning as effective and efficient as possible. This can be of critical importance when training data is limited or when only simple, fast policies are practical. Various Markov decision process designs are introduced, and simulation examples demonstrate the resulting increases in policy performance, even without integration into search algorithms.
Pages: 56-60
Page count: 5
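
The abstract describes framing the sequencing decisions of a tree-search scheduler as a Markov decision process and training policies by imitating an exact solver. The following is a minimal sketch of that framing, assuming a single-machine weighted-completion-time scheduling problem; the names (Task, SchedulingMDP, optimal_sequence, collect_demonstrations) and the brute-force oracle standing in for an exact scheduler such as Branch-and-Bound are illustrative assumptions, not the authors' implementation or their observation/reward design.

# Minimal sketch: task sequencing as an MDP, with demonstrations from an exact oracle.
# State: (current time, set of unscheduled task indices). Action: index of the next
# task to schedule. Reward: negated incremental weighted completion time, so an
# episode's return equals the negated total schedule cost.
from dataclasses import dataclass
from itertools import permutations
from typing import FrozenSet, List, Tuple

@dataclass(frozen=True)
class Task:
    duration: float
    weight: float  # per-unit-time cost of delaying this task's completion

State = Tuple[float, FrozenSet[int]]

class SchedulingMDP:
    def __init__(self, tasks: List[Task]):
        self.tasks = tasks

    def reset(self) -> State:
        return 0.0, frozenset(range(len(self.tasks)))

    def step(self, state: State, action: int):
        t, remaining = state
        assert action in remaining, "action must be an unscheduled task"
        task = self.tasks[action]
        t_next = t + task.duration
        # Weighted completion time of the task just scheduled, negated as reward.
        reward = -task.weight * t_next
        next_state = (t_next, remaining - {action})
        done = not next_state[1]
        return next_state, reward, done

def optimal_sequence(tasks: List[Task]) -> List[int]:
    """Brute-force oracle: minimize total weighted completion time over all orders."""
    def cost(order):
        t, total = 0.0, 0.0
        for i in order:
            t += tasks[i].duration
            total += tasks[i].weight * t
        return total
    return list(min(permutations(range(len(tasks))), key=cost))

def collect_demonstrations(tasks: List[Task]):
    """Roll the oracle's sequence through the MDP, yielding (state, action) pairs
    that an imitation-learning policy could be trained on."""
    mdp = SchedulingMDP(tasks)
    state = mdp.reset()
    pairs = []
    for action in optimal_sequence(tasks):
        pairs.append((state, action))
        state, _, _ = mdp.step(state, action)
    return pairs

if __name__ == "__main__":
    demo_tasks = [Task(3.0, 1.0), Task(1.0, 4.0), Task(2.0, 2.0)]
    for s, a in collect_demonstrations(demo_tasks):
        print(f"state={s} -> expert action={a}")

In this sketch the observation is the raw MDP state and the reward is the exact scheduling cost; the paper's contribution, per the abstract, is choosing alternative observation and reward functions so that a policy trained on such demonstrations learns effectively from limited data.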