Multi-objective reinforcement learning framework for dynamic flexible job shop scheduling problem with uncertain events

Cited: 57
Authors
Wang, Hao [1 ]
Cheng, Junfu [1 ]
Liu, Chang [1 ,2 ]
Zhang, Yuanyuan [1 ]
Hu, Shunfang [1 ]
Chen, Liangyin [1 ,3 ]
Affiliations
[1] Sichuan Univ, Sch Comp Sci, Chengdu 610065, Peoples R China
[2] Civil Aviat Adm, Res Inst 2, Chengdu 610041, Peoples R China
[3] Sichuan Univ, Inst Ind Internet Res, Chengdu 610065, Peoples R China
Keywords
scheduling problem; Real-time processing framework; Deep reinforcement learning; Local search algorithm; Dynamic multi-objective flexible job shop; GENETIC ALGORITHM; TABU SEARCH;
DOI
10.1016/j.asoc.2022.109717
CLC classification code
TP18 [Artificial Intelligence Theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
A manufacturing company's economic benefits are influenced by how it handles potential dynamic events and performs multi-objective real-time scheduling for the dynamic events that do occur. Motivated by this, we propose a new dynamic multi-objective flexible job shop scheduling problem (DMFJSP) to simulate a realistic production environment. The problem involves six dynamic events: job insertion, job cancellation, job operation modification, machine addition, machine tool replacement, and machine breakdown. It also formulates three objectives subject to a set of constraints: longest job processing time (makespan), average machine utilization, and average job processing delay rate. We then design a novel dynamic multi-objective scheduling algorithm based on deep reinforcement learning. The algorithm uses two deep Q-learning networks and a real-time processing framework to process each dynamic event and generate a complete scheduling scheme. In addition, an improved local search algorithm further optimizes the scheduling results, and the idea of rule combination makes the scheduling rules more comprehensive. Experiments on 27 instances show the superiority and stability of our approach compared with each proposed combined rule, well-known scheduling rules, and standard deep Q-learning based algorithms. Compared with the current best deep Q-learning method, the maximum performance improvements on our three objectives are approximately 57%, 164%, and 28%. (c) 2022 Published by Elsevier B.V.
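The three objectives named in the abstract can be computed directly from a completed schedule. The sketch below is a minimal illustration, not the paper's implementation: the `Op` record, the field names, and the reading of "average job processing delay rate" as mean relative tardiness against a per-job due date are all assumptions for the example.

```python
from dataclasses import dataclass

# Hypothetical minimal schedule record: one operation of a job,
# assigned to a machine with start/end times (illustrative fields).
@dataclass
class Op:
    job: int
    machine: int
    start: float
    end: float

def objectives(ops, due_dates):
    """Makespan, average machine utilization, average job delay rate."""
    makespan = max(o.end for o in ops)
    machines = {o.machine for o in ops}
    # Utilization of a machine = its total busy time / makespan.
    util = sum(
        sum(o.end - o.start for o in ops if o.machine == m) / makespan
        for m in machines
    ) / len(machines)
    jobs = {o.job for o in ops}
    # Delay rate of a job = tardiness past its due date, relative
    # to that due date (one plausible reading of the objective).
    delay = sum(
        max(0.0, max(o.end for o in ops if o.job == j) - due_dates[j])
        / due_dates[j]
        for j in jobs
    ) / len(jobs)
    return makespan, util, delay

# Toy instance: two jobs, two machines.
ops = [Op(0, 0, 0, 3), Op(0, 1, 3, 5), Op(1, 1, 0, 2), Op(1, 0, 3, 6)]
print(objectives(ops, {0: 4, 1: 5}))  # (6, 0.8333..., 0.225)
```

In a DRL scheduler like the one described, such objective values would feed the reward signal after each dispatching decision; here they are only evaluated once on a finished toy schedule.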
Pages: 19