Dynamic parallel machine scheduling with mean weighted tardiness objective by Q-Learning

Cited by: 0

Authors:

Zhicong Zhang

Li Zheng

Michael X. Weng

Affiliations:

[1] Tsinghua University, Department of Industrial Engineering

[2] University of South Florida, Department of Industrial and Management Systems Engineering

Source:

The International Journal of Advanced Manufacturing Technology | 2007 / Vol. 34

Keywords:

Scheduling; Parallel machine; Reinforcement learning; Q-Learning

DOI:

Not available

Abstract:

In this paper, we discuss a dynamic unrelated parallel machine scheduling problem with sequence-dependent setup times and machine–job qualification considerations. To apply the Q-Learning algorithm, we convert the scheduling problem into a reinforcement learning problem by constructing a semi-Markov decision process (SMDP), which involves defining the state representation, the actions and the reward function. We use five heuristics, WSPT, WMDD, WCOVERT, RATCS and LFJ-WCOVERT, as actions and prove the equivalence of the reward function and the scheduling objective: minimisation of mean weighted tardiness. We carry out computational experiments to compare the performance of the Q-Learning algorithm with that of the heuristics. The experimental results show that Q-Learning consistently outperforms all five heuristics by a wide margin. Averaged over all test problems, the Q-Learning algorithm improves on WSPT, WMDD, WCOVERT, RATCS and LFJ-WCOVERT by 61.38%, 60.82%, 56.23%, 57.48% and 66.22%, respectively.
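
The abstract does not give the paper's state representation, reward function or update rule, so the following is only a minimal sketch of the general idea: a tabular Q-Learning agent whose actions are the five named heuristics, together with the mean weighted tardiness objective. The function names, the `(weight, completion_time, due_date)` job tuple, the epsilon-greedy selection, and the `gamma ** tau` discount over the sojourn time `tau` are all assumptions made for illustration and may differ from what the authors actually use.

```python
from collections import defaultdict
import random

# The five dispatching heuristics named in the abstract, used here only as
# action labels; their actual priority rules are not implemented in this sketch.
ACTIONS = ["WSPT", "WMDD", "WCOVERT", "RATCS", "LFJ-WCOVERT"]


def mean_weighted_tardiness(jobs):
    """Mean of w * max(0, C - d) over a non-empty list of
    (weight, completion_time, due_date) tuples (hypothetical layout)."""
    jobs = list(jobs)
    total = sum(w * max(0.0, c - d) for w, c, d in jobs)
    return total / len(jobs)


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice among the heuristics (an assumed exploration rule)."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    # Ties are broken in favour of the heuristic listed first in ACTIONS.
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state,
             alpha=0.1, gamma=0.9, tau=1.0):
    """One tabular Q-Learning step. The gamma ** tau discount is a common
    SMDP-style form and is an assumption, not the paper's stated rule."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    target = reward + (gamma ** tau) * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


if __name__ == "__main__":
    q = defaultdict(float)      # Q-values default to 0.0
    rng = random.Random(0)
    a = choose_action(q, "s0", epsilon=0.0, rng=rng)
    # A negative reward stands in for tardiness incurred during the transition.
    q_update(q, "s0", a, reward=-1.0, next_state="s1", alpha=0.5, tau=2.0)
    print(a, q[("s0", a)])
    print(mean_weighted_tardiness([(2, 10, 8), (1, 5, 7)]))
```

Using a negative tardiness-based reward means that maximising return pushes tardiness down; the paper states that its reward function is provably equivalent to the mean weighted tardiness objective, which this toy reward does not attempt to reproduce.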

Pages: 968-980

Number of pages: 12

