Markov Decision Process Design for Imitation of Optimal Task Schedulers

Cited: 0
Authors
Rademacher, Paul [1 ]
Wagner, Kevin [2 ]
Smith, Leslie [1 ]
Affiliations
[1] US Naval Res Lab, Navy Ctr Appl Res AI, Washington, DC 20375 USA
[2] US Naval Res Lab, Div Radar, Washington, DC USA
Source
2023 IEEE Statistical Signal Processing Workshop (SSP), 2023
Keywords
Scheduling; imitation learning; Markov decision process; tree search
DOI
10.1109/SSP53291.2023.10207940
Chinese Library Classification
TP18 [Theory of Artificial Intelligence]
Discipline Codes
081104; 0812; 0835; 1405
Abstract
Due to the generally prohibitive computational requirements of optimal task schedulers, much of the field of task scheduling focuses on designing fast suboptimal algorithms. Since the tree search commonly used by sequencing algorithms such as Branch-and-Bound can naturally be framed as a Markov decision process, designing schedulers using imitation and reinforcement learning is a promising and active area of research. This paper demonstrates how policies can be trained on previously solved scheduling problems and successfully generalize to novel ones. Rather than policy design, however, this work focuses on designing the Markov decision process observation and reward functions to make learning as effective and efficient as possible. This can be critically important when training data is limited or when only simple, fast policies are practical. Various Markov decision process designs are introduced, and simulation examples demonstrate the resulting gains in policy performance, even without integration into search algorithms.
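To make the framing in the abstract concrete, the minimal Python sketch below (not code from the paper) casts single-machine task sequencing as a Markov decision process: the state tracks which tasks remain and the elapsed time, an action selects the next task to execute, and the reward is the negative weighted completion time that choice incurs. The class name `SchedulingMDP`, the task durations and weights, and the linear weighted-completion-time cost are all illustrative assumptions; the paper's contribution concerns richer observation and reward designs than shown here.

```python
import random

class SchedulingMDP:
    """Illustrative sketch: task sequencing framed as an MDP."""

    def __init__(self, durations, weights):
        self.durations = durations   # execution time of each task (assumed)
        self.weights = weights       # per-task delay penalty (assumed)
        self.reset()

    def reset(self):
        self.remaining = set(range(len(self.durations)))
        self.time = 0.0
        return self.observe()

    def observe(self):
        # Observation: per-task (duration, weight, still-pending flag).
        # Choosing what the policy observes is one of the MDP design
        # decisions the paper studies.
        return [(d, w, i in self.remaining)
                for i, (d, w) in enumerate(zip(self.durations, self.weights))]

    def step(self, task):
        # Action: execute one pending task next. Reward: negative weighted
        # completion time, so maximizing return minimizes the sequence's
        # total weighted completion time.
        assert task in self.remaining, "task already scheduled"
        self.remaining.remove(task)
        self.time += self.durations[task]
        reward = -self.weights[task] * self.time
        return self.observe(), reward, not self.remaining

# Roll out a random baseline policy on one problem instance.
env = SchedulingMDP(durations=[3.0, 1.0, 2.0], weights=[1.0, 4.0, 2.0])
env.reset()
episode_return, done = 0.0, False
while not done:
    action = random.choice(sorted(env.remaining))
    _, reward, done = env.step(action)
    episode_return += reward
print("episode return:", episode_return)
```

Because each episode is a complete task sequence, a policy trained to imitate an optimal scheduler on solved instances of this MDP can be rolled out directly, or used to guide a tree search such as Branch-and-Bound.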
Pages: 56-60
Page count: 5
Related Papers
50 records in total
  • [31] Partially Observable Markov Decision Process Approximations for Adaptive Sensing
    Chong, Edwin K. P.
    Kreucher, Christopher M.
    Hero, Alfred O., III
    Discrete Event Dynamic Systems: Theory and Applications, 2009, 19(3): 377-422
  • [33] Optimal forest management under financial risk aversion with discounted Markov decision process models
    Zhou, Mo
    Buongiorno, Joseph
    Canadian Journal of Forest Research, 2019, 49(7): 802-809
  • [34] Sensitivity analysis for the optimal minimal repair/replacement policies under the framework of Markov decision process
    Chen, Mingchih
    Cheng, Chun-Yuan
    2007 IEEE International Conference on Industrial Engineering and Engineering Management, Vols 1-4, 2007: 640-644
  • [35] A novel method for optimal test sequencing under unreliable test based on Markov Decision Process
    Liang, Yajun
    Xiao, Mingqing
    Tang, Xilang
    Ge, Yawei
    Wang, Xiaofei
    Journal of Intelligent & Fuzzy Systems, 2018, 35(3): 3605-3613
  • [36] PTMB: An online satellite task scheduling framework based on pre-trained Markov decision process for multi-task scenario
    Li, Guohao
    Li, Xuefei
    Li, Jing
    Chen, Jia
    Shen, Xin
    Knowledge-Based Systems, 2024, 284
  • [37] Optimizing Maintenance Decision in Rails: A Markov Decision Process Approach
    Sancho, Luis C. B.
    Braga, Joaquim A. P.
    Andrade, Antonio R.
    ASCE-ASME Journal of Risk and Uncertainty in Engineering Systems, Part A: Civil Engineering, 2021, 7(1)
  • [38] A Markov decision process for response adaptive designs
    Yi, Yanqing
    Wang, Xikui
    Econometrics and Statistics, 2023, 25: 125-133
  • [39] Reinforcement Learning to Rank with Markov Decision Process
    Wei, Zeng
    Xu, Jun
    Lan, Yanyan
    Guo, Jiafeng
    Cheng, Xueqi
    SIGIR'17: Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval, 2017: 945-948
  • [40] Abstractive Meeting Summarization as a Markov Decision Process
    Murray, Gabriel
    Advances in Artificial Intelligence (AI 2015), 2015, 9091: 212-219