Preference learning based deep reinforcement learning for flexible job shop scheduling problem

Cited by: 0
Authors
Liu, Xinning [1]
Han, Li [1]
Kang, Ling [2]
Liu, Jiannan [1]
Miao, Huadong [3]
Affiliations
[1] Dalian Neusoft Univ Informat, Sch Comp & Software, Dalian 116023, Liaoning, Peoples R China
[2] Dalian Neusoft Univ Informat, Neusoft Res Inst, Dalian 116023, Liaoning, Peoples R China
[3] SNOW China Beijing Co Ltd, Dalian Branch, Dalian 116023, Liaoning, Peoples R China
Keywords
Flexible job shop scheduling problem; Preference learning; Proximal policy optimization; Deep reinforcement learning; BENCHMARKS; ALGORITHM
DOI
10.1007/s40747-024-01772-x
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory]
Subject classification codes
081104; 0812; 0835; 1405
Abstract
The flexible job shop scheduling problem (FJSP) is important in both theoretical research and practical applications. Given the complexity and diversity of FJSP, improving the generalization and solution quality of scheduling methods has become an active research topic in both industry and academia. To address this, this paper proposes a Preference-Based Mask-PPO (PBMP) algorithm, which combines preference learning and invalid action masking to optimize FJSP solutions. First, a reward predictor based on preference learning is designed; it learns to predict rewards by comparing random fragments, eliminating the need for complex reward function design. Second, a novel intelligent switching mechanism is introduced, in which proximal policy optimization (PPO) is employed to enhance exploration during sampling and masked proximal policy optimization (Mask-PPO) refines the action space during training, significantly improving efficiency and solution quality. Furthermore, the Pearson correlation coefficient (PCC) is used to evaluate the performance of the preference model. Finally, comparative experiments on FJSP benchmark instances of varying sizes demonstrate that PBMP outperforms dispatching rules, OR-Tools, and other deep reinforcement learning (DRL) algorithms, achieving superior scheduling policies and faster convergence. Even as instance sizes increase, preference learning proves to be an effective reward mechanism in reinforcement learning for FJSP. The ablation study further highlights the contribution of each key component of the PBMP algorithm across performance metrics.
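The three building blocks named in the abstract (a preference-based reward predictor, invalid action masking, and the Pearson correlation coefficient) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not state how fragment comparisons are modeled, so the Bradley-Terry (logistic) form below is an assumption borrowed from common preference-based reinforcement learning practice, and every function name is hypothetical.

```python
import math


def preference_probability(pred_rewards_a, pred_rewards_b):
    """Probability that fragment A is preferred over fragment B.

    Assumption: a Bradley-Terry model over the summed predicted rewards
    of each fragment, i.e. sigmoid(sum(A) - sum(B)).
    """
    diff = sum(pred_rewards_a) - sum(pred_rewards_b)
    # Numerically stable logistic function.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def preference_loss(pred_rewards_a, pred_rewards_b, label):
    """Cross-entropy loss for one comparison.

    label is 1.0 if fragment A was preferred, 0.0 if fragment B was.
    """
    p = preference_probability(pred_rewards_a, pred_rewards_b)
    eps = 1e-12  # guards against log(0)
    return -(label * math.log(p + eps) + (1.0 - label) * math.log(1.0 - p + eps))


def masked_softmax(logits, valid_mask):
    """Softmax over valid actions only; invalid actions get probability 0."""
    valid_logits = [l for l, ok in zip(logits, valid_mask) if ok]
    if not valid_logits:
        raise ValueError("at least one action must be valid")
    m = max(valid_logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) if ok else 0.0 for l, ok in zip(logits, valid_mask)]
    total = sum(exps)
    return [e / total for e in exps]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)
```

In this sketch, `preference_loss` would be minimized over labeled fragment pairs to fit the reward predictor, `masked_softmax` would stand in for the policy's action distribution once infeasible operation-machine assignments are masked out, and `pearson` would compare predicted rewards with a reference signal to judge the preference model.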
Pages: 23
Related papers
50 in total
  • [21] Deep reinforcement learning for solving the joint scheduling problem of machines and AGVs in job shop
    Sun A.-H.
    Lei Q.
    Song Y.-C.
    Yang Y.-F.
    Lei, Qi (leiqi@cqu.edu.cn), 1600, Northeast University (39): 253-262
  • [22] Scheduling for the Flexible Job-Shop Problem with a Dynamic Number of Machines Using Deep Reinforcement Learning
    Chang, Yu-Hung
    Liu, Chien-Hung
    You, Shingchern D.
    INFORMATION, 2024, 15 (02)
  • [23] A discrete event simulator to implement deep reinforcement learning for the dynamic flexible job shop scheduling problem
    Tiacci, Lorenzo
    Rossi, Andrea
    SIMULATION MODELLING PRACTICE AND THEORY, 2024, 134
  • [24] A multi-action deep reinforcement learning framework for flexible Job-shop scheduling problem
    Lei, Kun
    Guo, Peng
    Zhao, Wenchao
    Wang, Yi
    Qian, Linmao
    Meng, Xiangyin
    Tang, Liansheng
    EXPERT SYSTEMS WITH APPLICATIONS, 2022, 205
  • [25] A spatial pyramid pooling-based deep reinforcement learning model for dynamic job-shop scheduling problem
    Wu, Xinquan
    Yan, Xuefeng
    COMPUTERS & OPERATIONS RESEARCH, 2023, 160
  • [26] Deep reinforcement learning for dynamic distributed job shop scheduling problem with transfers
    Lei, Yong
    Deng, Qianwang
    Liao, Mengqi
    Gao, Shuocheng
    EXPERT SYSTEMS WITH APPLICATIONS, 2024, 251
  • [27] A novel method for solving dynamic flexible job-shop scheduling problem via DIFFormer and deep reinforcement learning
    Wan, Lanjun
    Cui, Xueyan
    Zhao, Haoxin
    Fu, Long
    Li, Changyun
    COMPUTERS & INDUSTRIAL ENGINEERING, 2024, 198
  • [28] Deep reinforcement learning for adaptive flexible job shop scheduling: coping with variability and uncertainty
    Workneh, Abebaw Degu
    El Mouhtadi, Meryam
    Alaoui, Ahmed El Hilali
    SMART SCIENCE, 2024: 387-405
  • [29] End-to-End Multitarget Flexible Job Shop Scheduling With Deep Reinforcement Learning
    Wang, Rongkai
    Jing, Yiyang
    Gu, Chaojie
    He, Shibo
    Chen, Jiming
    IEEE INTERNET OF THINGS JOURNAL, 2025, 12 (04): 4420-4434
  • [30] Energy-Flexible Job-Shop Scheduling Using Deep Reinforcement Learning
    Felder, Mine
    Steiner, Daniel
    Busch, Paul
    Trat, Martin
    Sun, Chenwei
    Bender, Janek
    Ovtcharova, Jivka
    PROCEEDINGS OF THE CONFERENCE ON PRODUCTION SYSTEMS AND LOGISTICS, CPSL 2023-1, 2023: 353-362