Markov Decision Process Design for Imitation of Optimal Task Schedulers

Cited by: 0
Authors
Rademacher, Paul [1 ]
Wagner, Kevin [2 ]
Smith, Leslie [1 ]
Affiliations
[1] US Naval Res Lab, Navy Ctr Appl Res AI, Washington, DC 20375 USA
[2] US Naval Res Lab, Div Radar, Washington, DC USA
Source
2023 IEEE STATISTICAL SIGNAL PROCESSING WORKSHOP (SSP), 2023
Keywords
Scheduling; imitation learning; Markov decision process; tree search
DOI
10.1109/SSP53291.2023.10207940
Chinese Library Classification
TP18 [Artificial Intelligence Theory]
Subject Classification Codes
081104; 0812; 0835; 1405
Abstract
Due to the generally prohibitive computational requirements of optimal task schedulers, much of the field of task scheduling focuses on designing fast suboptimal algorithms. Since the tree search commonly used by sequencing algorithms such as Branch-and-Bound can naturally be framed as a Markov decision process, designing schedulers using imitation and reinforcement learning is a promising and active area of research. This paper demonstrates how policies can be trained on previously solved scheduling problems and successfully generalize to novel ones. Rather than focusing on policy design, however, this work focuses on designing the Markov decision process observation and reward functions to make learning as effective and efficient as possible. This can be of critical importance when training data is limited or when only simple, fast policies are practical. Various Markov decision process designs are introduced, and simulation examples demonstrate the resulting increases in policy performance, even without integration into search algorithms.
Pages: 56-60
Page count: 5
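
The abstract describes framing the sequencing decisions of a tree-search scheduler as a Markov decision process and training policies by imitating an exact solver. The following is a minimal sketch of that framing, assuming a single-machine weighted-completion-time scheduling problem; the names (Task, SchedulingMDP, optimal_sequence, collect_demonstrations) and the brute-force oracle standing in for an exact scheduler such as Branch-and-Bound are illustrative assumptions, not the authors' implementation or their observation/reward design.

# Minimal sketch: task sequencing as an MDP, with demonstrations from an exact oracle.
# State: (current time, set of unscheduled task indices). Action: index of the next
# task to schedule. Reward: negated incremental weighted completion time, so an
# episode's return equals the negated total schedule cost.
from dataclasses import dataclass
from itertools import permutations
from typing import FrozenSet, List, Tuple

@dataclass(frozen=True)
class Task:
    duration: float
    weight: float  # per-unit-time cost of delaying this task's completion

State = Tuple[float, FrozenSet[int]]

class SchedulingMDP:
    def __init__(self, tasks: List[Task]):
        self.tasks = tasks

    def reset(self) -> State:
        return 0.0, frozenset(range(len(self.tasks)))

    def step(self, state: State, action: int):
        t, remaining = state
        assert action in remaining, "action must be an unscheduled task"
        task = self.tasks[action]
        t_next = t + task.duration
        # Weighted completion time of the task just scheduled, negated as reward.
        reward = -task.weight * t_next
        next_state = (t_next, remaining - {action})
        done = not next_state[1]
        return next_state, reward, done

def optimal_sequence(tasks: List[Task]) -> List[int]:
    """Brute-force oracle: minimize total weighted completion time over all orders."""
    def cost(order):
        t, total = 0.0, 0.0
        for i in order:
            t += tasks[i].duration
            total += tasks[i].weight * t
        return total
    return list(min(permutations(range(len(tasks))), key=cost))

def collect_demonstrations(tasks: List[Task]):
    """Roll the oracle's sequence through the MDP, yielding (state, action) pairs
    that an imitation-learning policy could be trained on."""
    mdp = SchedulingMDP(tasks)
    state = mdp.reset()
    pairs = []
    for action in optimal_sequence(tasks):
        pairs.append((state, action))
        state, _, _ = mdp.step(state, action)
    return pairs

if __name__ == "__main__":
    demo_tasks = [Task(3.0, 1.0), Task(1.0, 4.0), Task(2.0, 2.0)]
    for s, a in collect_demonstrations(demo_tasks):
        print(f"state={s} -> expert action={a}")

In this sketch the observation is the raw MDP state and the reward is the exact scheduling cost; the paper's contribution, per the abstract, is choosing alternative observation and reward functions so that a policy trained on such demonstrations learns effectively from limited data.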