End-to-End Multitarget Flexible Job Shop Scheduling With Deep Reinforcement Learning

Cited by: 8
Authors
Wang, Rongkai [1 ]
Jing, Yiyang [1 ]
Gu, Chaojie [1 ]
He, Shibo [1 ]
Chen, Jiming [2 ]
Affiliations
[1] Zhejiang Univ, State Key Lab Ind Control Technol, Hangzhou 310027, Peoples R China
[2] Hangzhou Dianzi Univ, Sch Automat, Hangzhou 310018, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Job shop scheduling; Transportation; Production; Manufacturing; Heuristic algorithms; Energy consumption; Optimal scheduling; Metaheuristics; Dispatching; Computer architecture; Cloud-edge manufacturing paradigm; graph neural network (GNN); multiagent reinforcement learning; multitarget flexible job shop scheduling optimization (MT-FJSP); ALGORITHM;
DOI
10.1109/JIOT.2024.3485748
CLC Classification Number
TP [Automation Technology, Computer Technology];
Subject Classification Code
0812;
Abstract
Modeling and solving the flexible job shop scheduling problem (FJSP) is critical for modern manufacturing. However, existing works primarily focus on the time-related makespan target, often neglecting other practical factors, such as transportation. To address this, we formulate a more comprehensive multitarget FJSP that integrates makespan with varied transportation times and the total energy consumption of processing and transportation. The combination of these multiple real-world production targets renders the scheduling problem highly complex and challenging to solve. To overcome this challenge, this article proposes an end-to-end multiagent proximal policy optimization (PPO) approach. First, we represent the scheduling problem as a disjunctive graph (DG) with designed features for subtask nodes and newly constructed machine nodes, and we further annotate the arcs with transportation and standby times, respectively. Next, we use a graph neural network (GNN) to encode these features into node embeddings, which represent the states at each decision step. Finally, based on the vectorized value function and local critic networks, the PPO algorithm and the DG simulation environment iteratively interact to train the policy network. Our extensive experimental results validate the performance of the proposed approach, demonstrating its superiority over the state-of-the-art in terms of solution quality, online computation time, stability, and generalization.
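The DG-to-embedding step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy node features, the adjacency structure, and the GIN-style aggregation below are all invented for illustration, using plain NumPy in place of a trained GNN.

```python
import numpy as np

# Hypothetical toy disjunctive graph: 4 subtask nodes + 2 machine nodes.
# Node features (e.g., processing time, energy rate) are invented here.
features = np.array([
    [3.0, 1.2],   # subtask O11
    [2.0, 0.8],   # subtask O12
    [4.0, 1.5],   # subtask O21
    [1.0, 0.5],   # subtask O22
    [0.0, 2.0],   # machine node M1
    [0.0, 1.0],   # machine node M2
])

# Undirected adjacency: conjunctive arcs within each job (O11-O12, O21-O22)
# plus subtask-machine arcs; in the paper's formulation arcs would also
# carry transportation/standby times, omitted here for brevity.
adj = np.zeros((6, 6))
for i, j in [(0, 1), (2, 3), (0, 4), (1, 5), (2, 4), (3, 5)]:
    adj[i, j] = adj[j, i] = 1.0

def gnn_layer(h, adj, w):
    """One GIN-style message-passing layer: sum neighbors plus self, then ReLU."""
    agg = adj @ h + h               # aggregate neighbor features with a self-loop
    return np.maximum(agg @ w, 0.0)  # linear transform + ReLU nonlinearity

rng = np.random.default_rng(0)
w1 = rng.normal(size=(2, 8)) * 0.5   # untrained random weights, illustration only
w2 = rng.normal(size=(8, 8)) * 0.5

h = gnn_layer(features, adj, w1)
h = gnn_layer(h, adj, w2)            # per-node embeddings = per-node state
graph_state = h.mean(axis=0)         # mean-pooled global state, e.g. for a critic
print(h.shape, graph_state.shape)    # (6, 8) (8,)
```

In the actual approach the layer weights would be learned end-to-end together with the PPO policy and critic networks, and the embeddings would be recomputed at every scheduling decision step as the DG simulation environment evolves.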
Pages
4420-4434
Page count
15