Preference learning based deep reinforcement learning for flexible job shop scheduling problem

Cited by: 3

Authors:

Liu, Xinning [1]

Han, Li [1]

Kang, Ling [2]

Liu, Jiannan [1]

Miao, Huadong [3]

Affiliations:

[1] Dalian Neusoft Univ Informat, Sch Comp & Software, Dalian 116023, Liaoning, Peoples R China

[2] Dalian Neusoft Univ Informat, Neusoft Res Inst, Dalian 116023, Liaoning, Peoples R China

[3] SNOW China Beijing Co Ltd, Dalian Branch, Dalian 116023, Liaoning, Peoples R China

Source:

COMPLEX & INTELLIGENT SYSTEMS | 2025, Vol. 11, No. 2

Keywords:

Flexible job shop scheduling problem; Preference learning; Proximal policy optimization; Deep reinforcement learning; BENCHMARKS; ALGORITHM

DOI:

10.1007/s40747-024-01772-x

Chinese Library Classification:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

The flexible job shop scheduling problem (FJSP) holds significant importance in both theoretical research and practical applications. Given the complexity and diversity of FJSP, improving the generalization and quality of scheduling methods has become a hot topic of interest in both industry and academia. To address this, this paper proposes a Preference-Based Mask-PPO (PBMP) algorithm, which leverages the strengths of preference learning and invalid action masking to optimize FJSP solutions. First, a reward predictor based on preference learning is designed to model reward prediction by comparing random fragments, eliminating the need for complex reward function design. Second, a novel intelligent switching mechanism is introduced, where proximal policy optimization (PPO) is employed to enhance exploration during sampling, and masked proximal policy optimization (Mask-PPO) refines the action space during training, significantly improving efficiency and solution quality. Furthermore, the Pearson correlation coefficient (PCC) is used to evaluate the performance of the preference model. Finally, comparative experiments on FJSP benchmark instances of varying sizes demonstrate that PBMP outperforms traditional scheduling strategies such as dispatching rules, OR-Tools, and other deep reinforcement learning (DRL) algorithms, achieving superior scheduling policies and faster convergence. Even with increasing instance sizes, preference learning proves to be an effective reward mechanism in reinforcement learning for FJSP. The ablation study further highlights the advantages of each key component in the PBMP algorithm across performance metrics.
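The abstract names three building blocks without giving their formulas: a preference-based reward predictor, invalid action masking, and the Pearson correlation coefficient. The following Python sketch illustrates one common form of each. It is not the authors' implementation: the Bradley-Terry (logistic) preference model is an assumption borrowed from typical preference-based reinforcement learning, and the function names `preference_probability`, `masked_softmax`, and `pearson` are hypothetical.

```python
import math


def preference_probability(rewards_a, rewards_b):
    """Probability that fragment A is preferred over fragment B.

    Assumes a Bradley-Terry model over the summed predicted rewards of
    each fragment; the abstract does not confirm this exact form.
    """
    diff = sum(rewards_a) - sum(rewards_b)
    # Numerically stable logistic function of the return difference.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def masked_softmax(logits, valid):
    """Softmax over action logits that assigns zero probability to invalid actions."""
    if not any(valid):
        raise ValueError("at least one action must be valid")
    # Subtract the largest valid logit for numerical stability.
    top = max(l for l, v in zip(logits, valid) if v)
    exps = [math.exp(l - top) if v else 0.0 for l, v in zip(logits, valid)]
    total = sum(exps)
    return [e / total for e in exps]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)
```

Under these assumptions, a reward predictor would be trained by minimizing the cross-entropy between `preference_probability` and the observed preference labels, `masked_softmax` would stand in for the policy's action distribution when some machine or operation choices are infeasible, and `pearson` could compare predicted rewards against a reference signal such as the negative makespan.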

Pages: 23

References: 50 in total

