Dynamic job-shop scheduling in smart manufacturing using deep reinforcement learning

Cited by: 158
Authors
Wang, Libing [1 ]
Hu, Xin [1 ]
Wang, Yin [1 ]
Xu, Sujie [1 ]
Ma, Shijun [1 ]
Yang, Kexin [2 ]
Liu, Zhijun [1 ]
Wang, Weidong [1 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Sch Elect Engn, Beijing 100876, Peoples R China
[2] Beijing Univ Posts & Telecommun, Int Sch, Beijing 100876, Peoples R China
Keywords
Smart manufacturing; Job-shop scheduling; Deep reinforcement learning; Proximal policy optimization; ALGORITHM; ALLOCATION
DOI
10.1016/j.comnet.2021.107969
Chinese Library Classification (CLC)
TP3 [Computing technology; computer technology]
Discipline Classification Code
0812
Abstract
The job-shop scheduling problem (JSP) determines the processing order of jobs and is a typical scheduling problem in smart manufacturing. Given the dynamics and uncertainties of the job-shop environment, such as machine breakdowns and job rework, it is essential to adjust the scheduling strategy flexibly according to the current state. Traditional methods can only obtain the optimal solution for the current state and must re-solve the problem whenever the state changes, which leads to high time complexity. To address this issue, this paper proposes a dynamic scheduling method based on deep reinforcement learning (DRL). The proposed method adopts proximal policy optimization (PPO) to find the optimal scheduling policy, mitigating the curse of dimensionality in the state and action spaces as the problem scale grows. Experimental results show that, compared with traditional scheduling methods, the proposed method not only obtains comparable results but also realizes adaptive, real-time production scheduling.
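The abstract's central point, that a state-conditioned policy makes one dispatching decision per event instead of re-solving the whole instance after every disturbance, can be illustrated with a toy simulation. This is a minimal sketch, not the authors' implementation: the `simulate` function, the greedy `choose` rule, and the two-job instance are illustrative stand-ins for the trained PPO policy and the benchmark instances evaluated in the paper.

```python
# Toy job-shop dispatching sketch (hypothetical; not the paper's code).
# Each job is a list of (machine, duration) operations processed in order.
# `choose` plays the role of the policy: called once per decision on the
# current state, which is where a trained PPO network would plug in.

def simulate(jobs, n_machines, choose=min):
    """Non-delay list scheduling for a job shop; returns the makespan.

    jobs: list of operation lists, each operation a (machine, duration) pair.
    choose: picks one (start, duration, job, machine) tuple from the ready
    set; the default greedy rule takes earliest start, then shortest duration.
    """
    next_op = [0] * len(jobs)        # index of each job's next operation
    job_free = [0] * len(jobs)       # earliest time each job can continue
    mach_free = [0] * n_machines     # earliest time each machine is idle
    scheduled, total = 0, sum(len(ops) for ops in jobs)
    while scheduled < total:
        # Rebuild the candidate set from the *current* state each step, so a
        # disruption (e.g. pushing mach_free forward after a breakdown) only
        # changes future decisions -- no global re-solve is needed.
        candidates = []
        for j, ops in enumerate(jobs):
            if next_op[j] < len(ops):
                m, d = ops[next_op[j]]
                start = max(job_free[j], mach_free[m])
                candidates.append((start, d, j, m))
        start, d, j, m = choose(candidates)   # one policy call per decision
        job_free[j] = mach_free[m] = start + d
        next_op[j] += 1
        scheduled += 1
    return max(job_free)

# Two jobs on two machines: job 0 runs M0 for 3 then M1 for 2;
# job 1 runs M1 for 2 then M0 for 1.
jobs = [[(0, 3), (1, 2)], [(1, 2), (0, 1)]]
print(simulate(jobs, 2))  # → 5
```

Because the policy is queried per decision on the observed state, swapping the greedy tuple comparison for a neural policy's action changes only the `choose` argument, which mirrors the real-time, adaptive scheduling the abstract describes.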
Pages: 9