Reinforcement learning for online optimization of job-shop scheduling in a smart manufacturing factory

Cited by: 22
Authors
Zhou, Tong [1]
Zhu, Haihua [1]
Tang, Dunbing [1]
Liu, Changchun [1]
Cai, Qixiang [1]
Shi, Wei [1]
Gui, Yong [1]
Affiliations
[1] Nanjing Univ Aeronaut & Astronaut, Coll Mech & Elect Engn, 29 Yudao St, Nanjing 210016, Peoples R China
Funding
National Natural Science Foundation of China
Keywords
Job shop; online scheduling; multi-objective optimization; composite reward; reinforcement learning; SYSTEM; GAME; ALGORITHM; GO
DOI
10.1177/16878132221086120
Chinese Library Classification number
O414.1 [Thermodynamics]
Discipline classification code
Abstract
The job-shop scheduling problem (JSSP) is a complex combinatorial problem, especially in dynamic environments. Low-volume, high-mix orders contain various design specifications that introduce many uncertainties into manufacturing systems. Traditional scheduling methods are limited in handling diverse manufacturing resources in a dynamic environment. In recent years, artificial intelligence (AI) has attracted the interest of researchers in solving dynamic scheduling problems. However, it is difficult to optimize scheduling policies for online decision making while considering multiple objectives. Therefore, this paper proposes a smart scheduler to handle real-time jobs and unexpected events in smart manufacturing factories. New composite reward functions are formulated to improve the decision-making ability and learning efficiency of the smart scheduler. Based on deep reinforcement learning (RL), the smart scheduler autonomously learns to schedule manufacturing resources in real time and improves its decision-making ability dynamically. We evaluate and validate the proposed scheduling model with a series of experiments on a smart factory testbed. Experimental results show that the smart scheduler not only achieves good learning and scheduling performance by optimizing the composite reward functions, but also copes with unexpected events (e.g., urgent or simultaneous orders, machine failures) and balances efficiency and profit.
Pages: 19