Preference learning based deep reinforcement learning for flexible job shop scheduling problem

Authors
Liu, Xinning [1]
Han, Li [1]
Kang, Ling [2]
Liu, Jiannan [1]
Miao, Huadong [3]
Affiliations
[1] Dalian Neusoft Univ Informat, Sch Comp & Software, Dalian 116023, Liaoning, Peoples R China
[2] Dalian Neusoft Univ Informat, Neusoft Res Inst, Dalian 116023, Liaoning, Peoples R China
[3] SNOW China Beijing Co Ltd, Dalian Branch, Dalian 116023, Liaoning, Peoples R China
Keywords
Flexible job shop scheduling problem; Preference learning; Proximal policy optimization; Deep reinforcement learning; BENCHMARKS; ALGORITHM
DOI
10.1007/s40747-024-01772-x
Chinese Library Classification number
TP18 [Artificial intelligence theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The flexible job shop scheduling problem (FJSP) holds significant importance in both theoretical research and practical applications. Given the complexity and diversity of FJSP, improving the generalization and quality of scheduling methods has become a hot topic of interest in both industry and academia. To address this, this paper proposes a Preference-Based Mask-PPO (PBMP) algorithm, which leverages the strengths of preference learning and invalid action masking to optimize FJSP solutions. First, a reward predictor based on preference learning is designed to model reward prediction by comparing random fragments, eliminating the need for complex reward function design. Second, a novel intelligent switching mechanism is introduced, where proximal policy optimization (PPO) is employed to enhance exploration during sampling, and masked proximal policy optimization (Mask-PPO) refines the action space during training, significantly improving efficiency and solution quality. Furthermore, the Pearson correlation coefficient (PCC) is used to evaluate the performance of the preference model. Finally, comparative experiments on FJSP benchmark instances of varying sizes demonstrate that PBMP outperforms traditional scheduling strategies such as dispatching rules, OR-Tools, and other deep reinforcement learning (DRL) algorithms, achieving superior scheduling policies and faster convergence. Even with increasing instance sizes, preference learning proves to be an effective reward mechanism in reinforcement learning for FJSP. The ablation study further highlights the advantages of each key component in the PBMP algorithm across performance metrics.
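The three mechanisms named in the abstract (a preference-based reward predictor, invalid action masking, and the Pearson correlation coefficient) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not specify the form of the preference model, so the sketch assumes the Bradley-Terry formulation that is common in preference-based reinforcement learning, in which the probability of preferring one fragment over another depends on the difference of their summed predicted rewards. All function names here are hypothetical.

```python
import math


def preference_probability(rewards_a, rewards_b):
    """Probability that fragment A is preferred over fragment B.

    Assumption: a Bradley-Terry model on the summed predicted rewards
    of each fragment (the abstract does not state the exact model).
    """
    diff = sum(rewards_a) - sum(rewards_b)
    # Numerically stable logistic function.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def preference_loss(rewards_a, rewards_b, label):
    """Cross-entropy loss for one comparison.

    label is 1.0 if fragment A was preferred, 0.0 if fragment B was.
    """
    p = preference_probability(rewards_a, rewards_b)
    eps = 1e-12  # guards against log(0)
    return -(label * math.log(p + eps) + (1.0 - label) * math.log(1.0 - p + eps))


def masked_softmax(logits, valid):
    """Softmax restricted to valid actions; invalid actions get probability 0."""
    if not any(valid):
        raise ValueError("at least one action must be valid")
    max_logit = max(l for l, v in zip(logits, valid) if v)
    exps = [math.exp(l - max_logit) if v else 0.0 for l, v in zip(logits, valid)]
    total = sum(exps)
    return [e / total for e in exps]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences.

    Undefined (division by zero) if either sequence is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (std_x * std_y)
```

In this sketch, a reward predictor would be trained by minimizing `preference_loss` over many fragment pairs, `masked_softmax` would keep the policy from sampling infeasible operation-machine assignments, and `pearson` would compare predicted rewards against a reference signal such as the true reward.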
Pages: 23