Markov Decision Process Design for Imitation of Optimal Task Schedulers

Cited: 0
Authors
Rademacher, Paul [1 ]
Wagner, Kevin [2 ]
Smith, Leslie [1 ]
Affiliations
[1] US Naval Res Lab, Navy Ctr Appl Res AI, Washington, DC 20375 USA
[2] US Naval Res Lab, Div Radar, Washington, DC USA
Source
2023 IEEE Statistical Signal Processing Workshop (SSP), 2023
Keywords
Scheduling; imitation learning; Markov decision process; tree search
DOI
10.1109/SSP53291.2023.10207940
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Due to the generally prohibitive computational requirements of optimal task schedulers, much of the field of task scheduling focuses on designing fast suboptimal algorithms. Since the tree search commonly used by sequencing algorithms such as Branch-and-Bound can naturally be framed as a Markov decision process, designing schedulers using imitation and reinforcement learning is a promising and active area of research. This paper demonstrates how policies can be trained on previously solved scheduling problems and successfully generalize to novel ones. Rather than policy design, however, this work focuses on designing the Markov decision process observation and reward functions to make learning as effective and efficient as possible. This can be critically important when training data is limited or when only simple, fast policies are practical. Various Markov decision process designs are introduced, and simulation examples demonstrate the resulting gains in policy performance, even without integration into search algorithms.
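To make the framing in the abstract concrete, the minimal Python sketch below (not code from the paper) casts single-machine task sequencing as a Markov decision process: the state tracks which tasks remain and the elapsed time, an action selects the next task to execute, and the reward is the negative weighted completion time that choice incurs. The class name `SchedulingMDP`, the task durations and weights, and the linear weighted-completion-time cost are all illustrative assumptions; the paper's contribution concerns richer observation and reward designs than shown here.

```python
import random

class SchedulingMDP:
    """Illustrative sketch: task sequencing framed as an MDP."""

    def __init__(self, durations, weights):
        self.durations = durations   # execution time of each task (assumed)
        self.weights = weights       # per-task delay penalty (assumed)
        self.reset()

    def reset(self):
        self.remaining = set(range(len(self.durations)))
        self.time = 0.0
        return self.observe()

    def observe(self):
        # Observation: per-task (duration, weight, still-pending flag).
        # Choosing what the policy observes is one of the MDP design
        # decisions the paper studies.
        return [(d, w, i in self.remaining)
                for i, (d, w) in enumerate(zip(self.durations, self.weights))]

    def step(self, task):
        # Action: execute one pending task next. Reward: negative weighted
        # completion time, so maximizing return minimizes the sequence's
        # total weighted completion time.
        assert task in self.remaining, "task already scheduled"
        self.remaining.remove(task)
        self.time += self.durations[task]
        reward = -self.weights[task] * self.time
        return self.observe(), reward, not self.remaining

# Roll out a random baseline policy on one problem instance.
env = SchedulingMDP(durations=[3.0, 1.0, 2.0], weights=[1.0, 4.0, 2.0])
env.reset()
episode_return, done = 0.0, False
while not done:
    action = random.choice(sorted(env.remaining))
    _, reward, done = env.step(action)
    episode_return += reward
print("episode return:", episode_return)
```

Because each episode is a complete task sequence, a policy trained to imitate an optimal scheduler on solved instances of this MDP can be rolled out directly, or used to guide a tree search such as Branch-and-Bound.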
Pages: 56-60
Page count: 5
Related Papers
50 records in total
  • [31] Partially Observable Markov Decision Process Approximations for Adaptive Sensing
    Chong, Edwin K. P.
    Kreucher, Christopher M.
    Hero, Alfred O., III
    Discrete Event Dynamic Systems: Theory and Applications, 2009, 19(3): 377-422
  • [33] Optimal forest management under financial risk aversion with discounted Markov decision process models
    Zhou, Mo
    Buongiorno, Joseph
    Canadian Journal of Forest Research, 2019, 49(7): 802-809
  • [34] Sensitivity analysis for the optimal minimal repair/replacement policies under the framework of Markov decision process
    Chen, Mingchih
    Cheng, Chun-Yuan
    2007 IEEE International Conference on Industrial Engineering and Engineering Management, Vols 1-4, 2007: 640-644
  • [35] A novel method for optimal test sequencing under unreliable test based on Markov Decision Process
    Liang, Yajun
    Xiao, Mingqing
    Tang, Xilang
    Ge, Yawei
    Wang, Xiaofei
    Journal of Intelligent & Fuzzy Systems, 2018, 35(3): 3605-3613
  • [36] PTMB: An online satellite task scheduling framework based on pre-trained Markov decision process for multi-task scenario
    Li, Guohao
    Li, Xuefei
    Li, Jing
    Chen, Jia
    Shen, Xin
    Knowledge-Based Systems, 2024, 284
  • [37] Optimizing Maintenance Decision in Rails: A Markov Decision Process Approach
    Sancho, Luis C. B.
    Braga, Joaquim A. P.
    Andrade, Antonio R.
    ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering, 2021, 7(1)
  • [38] A Markov decision process for response adaptive designs
    Yi, Yanqing
    Wang, Xikui
    Econometrics and Statistics, 2023, 25: 125-133
  • [39] Reinforcement Learning to Rank with Markov Decision Process
    Wei, Zeng
    Xu, Jun
    Lan, Yanyan
    Guo, Jiafeng
    Cheng, Xueqi
    SIGIR'17: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017: 945-948
  • [40] Abstractive Meeting Summarization as a Markov Decision Process
    Murray, Gabriel
    Advances in Artificial Intelligence (AI 2015), 2015, 9091: 212-219