Markov Decision Process Design for Imitation of Optimal Task Schedulers

Cited by: 0
Authors
Rademacher, Paul [1 ]
Wagner, Kevin [2 ]
Smith, Leslie [1 ]
Affiliations
[1] US Naval Res Lab, Navy Ctr Appl Res AI, Washington, DC 20375 USA
[2] US Naval Res Lab, Div Radar, Washington, DC USA
Source
2023 IEEE STATISTICAL SIGNAL PROCESSING WORKSHOP, SSP | 2023
Keywords
Scheduling; imitation learning; Markov decision process; tree search;
DOI
10.1109/SSP53291.2023.10207940
CLC Classification Number
TP18 [Artificial intelligence theory];
Subject Classification Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Due to the generally prohibitive computational requirements of optimal task schedulers, much of the field of task scheduling focuses on designing fast suboptimal algorithms. Since the tree search commonly used by sequencing algorithms such as Branch-and-Bound can naturally be framed as a Markov decision process, designing schedulers using imitation and reinforcement learning is a promising and active area of research. This paper demonstrates how policies can be trained on previously solved scheduling problems and successfully generalize to novel ones. Instead of focusing on policy design, however, this work focuses on designing the Markov decision process observation and reward functions to make learning as effective and efficient as possible. This can be of critical importance when training data is limited or when only simple, fast policies are practical. Various Markov decision process designs are introduced and simulation examples demonstrate the resultant increases in policy performance, even without integration into search algorithms.
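The abstract's core framing, sequencing decisions cast as a Markov decision process whose state is the set of unscheduled tasks, can be illustrated with a minimal sketch. This is not the paper's implementation: the `Task` fields, the weighted-completion-time cost, and the WSPT baseline policy are illustrative assumptions chosen for a simple single-machine setting.

```python
from dataclasses import dataclass

# Illustrative sketch (not the paper's implementation): task sequencing as an
# MDP. State = (remaining tasks, elapsed time); action = index of the next
# task; reward = negative incremental cost (here, weighted completion time).

@dataclass(frozen=True)
class Task:
    duration: float
    weight: float  # penalty per unit of completion time

def step(state, action):
    """Apply one scheduling decision; return (next_state, reward)."""
    remaining, t = state
    task = remaining[action]
    t_next = t + task.duration
    next_remaining = remaining[:action] + remaining[action + 1:]
    reward = -task.weight * t_next  # negative incremental cost
    return (next_remaining, t_next), reward

def rollout(tasks, policy):
    """Run a policy until all tasks are scheduled; return (sequence, return)."""
    state, total, seq = (tuple(tasks), 0.0), 0.0, []
    while state[0]:
        a = policy(state)
        seq.append(state[0][a])
        state, r = step(state, a)
        total += r
    return seq, total

# Baseline policy: weighted-shortest-processing-time (WSPT) rule, which is
# optimal for total weighted completion time on a single machine. A learned
# policy would replace this function, mapping the MDP observation to an action.
def wspt_policy(state):
    remaining, _ = state
    return max(range(len(remaining)),
               key=lambda i: remaining[i].weight / remaining[i].duration)

tasks = [Task(3.0, 1.0), Task(1.0, 4.0), Task(2.0, 2.0)]
seq, reward = rollout(tasks, wspt_policy)
# Schedules the tasks in order of descending weight/duration ratio.
```

The point of the MDP factoring is that `policy` is the only piece a learner must supply; the paper's contribution concerns how the observation passed to that policy and the per-step reward are designed, not the policy architecture itself.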
Pages: 56-60
Page count: 5