A Reinforcement Learning Motivated Algorithm for Process Optimization

被引：2

作者：

Abraham, Gyula ^{[1
]}

Auer, Peter ^{[2
]}

Dosa, Gyorgy ^{[1
]}

Dulai, Tibor ^{[1
]}

Werner-Stark, Agnes ^{[1
]}

机构：

[1] Univ Pannonia, Fac Informat Technol, Egyet Str 10, H-8200 Veszprem, Hungary

[2] Univ Leoben, Dept Math & Informat Technol, Franz Josef Str 18, A-8700 Leoben, Austria

来源：

PERIODICA POLYTECHNICA-CIVIL ENGINEERING | 2019年 / 63卷 / 04期

关键词：

process scheduling; reinforcement learning; scheduling; resource allocation; APPROXIMATION; BOUNDS;

D O I：

10.3311/PPci.14295

中图分类号：

TU [建筑科学];

学科分类号：

0813 ;

摘要：

In process scheduling problems there are several processes and resources. Any process consists of several tasks, and there may be precedence constraints among them. In our paper we consider a special case, where the precedence constraints form short disjoint (directed) paths. This model occurs frequently in practice, but as far as we know it is considered very rarely in the literature. The goal is to find a good resource allocation (schedule) to minimize the makespan. The problem is known to be strongly NP-hard, and such hard problems are often solved by heuristic methods. We found only one paper which is closely related to our topic, this paper proposes the heuristic method HH. We propose a new heuristic called QLM which is inspired by reinforcement learning methods from the area of machine learning. As we did not find appropriate benchmark problems for the investigated model. We have created such inputs and we have made exhaustive comparisons, comparing the results of HH and QLM, and an exact solver using CPLEX. We note that a heuristic method can give a "near optimal" solution very fast while an exact solver provides the optimal solution, but it may need a huge amount of time to find it. In our computational evaluation we experienced that our heuristic is more effective than HH and finds the optimal solution in many cases and very fast.

引用

页码：961 / 970

页数：10

共 22 条

[1] Dynamic job-shop scheduling using reinforcement learning agents [J].

Aydin, ME ;

Öztemel, E .

ROBOTICS AND AUTONOMOUS SYSTEMS, 2000, 33 (2-3) :169-178

[2]

Chunfeng Liu, 2011, Journal of Software, V6, P1146, DOI 10.4304/jsw.6.6.1146-1153

[3]

Dosa Gy, 2016, MATCOS 16, P80

[4] Restricted assignment scheduling with resource constraints [J].

Dosa, Gyorgy ;

Kellerer, Hans ;

Tuza, Zsolt .

THEORETICAL COMPUTER SCIENCE, 2019, 760 :72-87

[5] Multiprofessor scheduling [J].

Dosa, Gyorgy ;

Tuza, Zsolt .

DISCRETE APPLIED MATHEMATICS, 2018, 234 :195-209

[6]

Gabel T, 2008, Int. J. Inf. Technol. Intell. Comput., V24

[7] BOUNDS ON MULTIPROCESSING TIMING ANOMALIES [J].

GRAHAM, RL .

SIAM JOURNAL ON APPLIED MATHEMATICS, 1969, 17 (02) :416-&

[8] BOUNDS FOR CERTAIN MULTIPROCESSING ANOMALIES [J].

GRAHAM, RL .

BELL SYSTEM TECHNICAL JOURNAL, 1966, 45 (09) :1563-+

[9] Heuristics for unrelated machine scheduling with precedence constraints [J].

Herrmann, J ;

Proth, JM ;

Sauer, N .

EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 1997, 102 (03) :528-537

[10] Reinforcement learning based resource allocation in business process management [J].

Huang, Zhengxing ;

van der Aalst, W. M. P. ;

Lu, Xudong ;

Duan, Huilong .

DATA & KNOWLEDGE ENGINEERING, 2011, 70 (01) :127-145

← 1 2 3 →