Gated-Attention Model with Reinforcement Learning for Solving Dynamic Job Shop Scheduling Problem

Cited by: 17

Authors:

Gebreyesus, Goytom ^{[1]}

Fellek, Getu ^{[1]}

Farid, Ahmed ^{[1]}

Fujimura, Shigeru ^{[1]}

Yoshie, Osamu ^{[1]}

Affiliation:

[1] Waseda University, Graduate School of Information, Production and Systems, Fukuoka, Japan

Source:

IEEJ TRANSACTIONS ON ELECTRICAL AND ELECTRONIC ENGINEERING | 2023 / Vol. 18 / No. 06

Keywords:

deep reinforcement learning; job shop scheduling; gated attention mechanism; mean weighted tardiness; search algorithm

DOI:

10.1002/tee.23788

Chinese Library Classification (CLC) number:

TM [Electrical engineering]; TN [Electronic technology, communication technology]

Discipline classification code:

0808; 0809

Abstract:

The job shop scheduling problem (JSSP) is a well-known NP-hard combinatorial optimization problem (COP) that aims to optimize the sequential assignment of a finite set of machines to a set of jobs while adhering to specified problem constraints. Conventional solution approaches, which include heuristic dispatching rules and evolutionary algorithms, have been widely used to solve JSSPs. Recently, reinforcement learning (RL) has gained popularity for delivering better solution quality for JSSPs. In this research, we propose an end-to-end deep reinforcement learning (DRL) based scheduling model for solving the standard JSSP. Our DRL model uses the attention-based encoder of the Transformer network to embed the JSSP environment, which is represented as a disjunctive graph. We introduce a gate mechanism that modulates the flow of learnt features, preventing noisy features from propagating across the network and thereby enriching the representations of the nodes of the disjunctive graph. In addition, we design a novel gate-based graph pooling mechanism that preferentially constructs the graph embedding. A simple multi-layer perceptron (MLP) based action selection network is used to sequentially generate optimal schedules. The model is trained using the proximal policy optimization (PPO) algorithm, which is built on the actor-critic (AC) framework. Experimental results show that our model outperforms existing heuristics and state-of-the-art DRL-based baselines on generated instances and well-known public test benchmarks. © 2023 Institute of Electrical Engineers of Japan. Published by Wiley Periodicals LLC.

Pages: 932-944

Page count: 13
