Markov Decision Process Design for Imitation of Optimal Task Schedulers

Cited by: 0
Authors
Rademacher, Paul [1 ]
Wagner, Kevin [2 ]
Smith, Leslie [1 ]
Affiliations
[1] US Naval Res Lab, Navy Ctr Appl Res AI, Washington, DC 20375 USA
[2] US Naval Res Lab, Div Radar, Washington, DC USA
Source
2023 IEEE STATISTICAL SIGNAL PROCESSING WORKSHOP, SSP | 2023
Keywords
Scheduling; imitation learning; Markov decision process; tree search;
DOI
10.1109/SSP53291.2023.10207940
CLC Classification Number
TP18 [Artificial intelligence theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Due to the generally prohibitive computational requirements of optimal task schedulers, much of the field of task scheduling focuses on designing fast suboptimal algorithms. Since the tree search commonly used by sequencing algorithms such as Branch-and-Bound can naturally be framed as a Markov decision process, designing schedulers using imitation and reinforcement learning is a promising and active area of research. This paper demonstrates how policies can be trained on previously solved scheduling problems and successfully generalize to novel ones. Instead of focusing on policy design, however, this work focuses on designing the Markov decision process observation and reward functions to make learning as effective and efficient as possible. This can be of critical importance when training data is limited or when only simple, fast policies are practical. Various Markov decision process designs are introduced and simulation examples demonstrate the resultant increases in policy performance, even without integration into search algorithms.
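The abstract's core framing, sequencing decisions cast as a Markov decision process whose state is the set of unscheduled tasks, can be illustrated with a minimal sketch. This is not the paper's implementation: the `Task` fields, the weighted-completion-time cost, and the WSPT baseline policy are illustrative assumptions chosen for a simple single-machine setting.

```python
from dataclasses import dataclass

# Illustrative sketch (not the paper's implementation): task sequencing as an
# MDP. State = (remaining tasks, elapsed time); action = index of the next
# task; reward = negative incremental cost (here, weighted completion time).

@dataclass(frozen=True)
class Task:
    duration: float
    weight: float  # penalty per unit of completion time

def step(state, action):
    """Apply one scheduling decision; return (next_state, reward)."""
    remaining, t = state
    task = remaining[action]
    t_next = t + task.duration
    next_remaining = remaining[:action] + remaining[action + 1:]
    reward = -task.weight * t_next  # negative incremental cost
    return (next_remaining, t_next), reward

def rollout(tasks, policy):
    """Run a policy until all tasks are scheduled; return (sequence, return)."""
    state, total, seq = (tuple(tasks), 0.0), 0.0, []
    while state[0]:
        a = policy(state)
        seq.append(state[0][a])
        state, r = step(state, a)
        total += r
    return seq, total

# Baseline policy: weighted-shortest-processing-time (WSPT) rule, which is
# optimal for total weighted completion time on a single machine. A learned
# policy would replace this function, mapping the MDP observation to an action.
def wspt_policy(state):
    remaining, _ = state
    return max(range(len(remaining)),
               key=lambda i: remaining[i].weight / remaining[i].duration)

tasks = [Task(3.0, 1.0), Task(1.0, 4.0), Task(2.0, 2.0)]
seq, reward = rollout(tasks, wspt_policy)
# Schedules the tasks in order of descending weight/duration ratio.
```

The point of the MDP factoring is that `policy` is the only piece a learner must supply; the paper's contribution concerns how the observation passed to that policy and the per-step reward are designed, not the policy architecture itself.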
Pages: 56-60
Page count: 5