Neuroevolution reinforcement learning for multi-echelon inventory optimization with delivery options and uncertain discount

Cited by: 4
Authors
Rizqi, Zakka Ugih [1 ]
Chou, Shuo-Yan [1 ,2 ]
Affiliations
[1] Natl Taiwan Univ Sci & Technol, Dept Ind Management, 43 Keelung Rd, Taipei 10607, Taiwan
[2] Natl Taiwan Univ Sci & Technol, Intelligent Mfg Innovat Ctr, 43 Keelung Rd, Taipei 10607, Taiwan
Keywords
Multi-echelon inventory; Optimization; Reinforcement learning; Simulation; Supply chain; Management; Policies
DOI
10.1016/j.engappai.2024.108670
Chinese Library Classification (CLC)
TP [Automation Technology; Computer Technology]
Discipline Code
0812
Abstract
Advances in information technology have enabled supply chains to make centralized decisions, making globally optimal solutions attainable. However, dealing with uncertainty remains essential in inventory management: besides demand and supply uncertainties, supplier discounts often arise unexpectedly. Further, suppliers or third parties typically offer multiple delivery options that trade off cost against lead time. This study therefore introduces a new problem, Multi-Echelon Inventory Optimization with Delivery Options and Uncertain Discount (MEIO-DO-UD). As a solution, a Neuroevolution Reinforcement Learning (NERL) framework is developed to minimize total system cost. The environment is modeled via System Dynamics (SD), and the actor is represented by an Artificial Neural Network whose weights are optimized by an Evolutionary Algorithm (EA), creating an effective decision-making model under dynamic uncertainty. An experimental study is conducted on two supply chain networks, serial and divergent. Three EAs are compared: Differential Evolution (DE), Memetic Algorithm (MA), and Evolution Strategy (ES). NERL is also compared with the EA-optimized classical continuous-review (s,Q) model. The results show that, regardless of which EA is used, the proposed NERL always outperforms the EA-optimized (s,Q) model; the more complex the problem, the greater the improvement, with cost reductions of up to 58% alongside fill-rate gains. The results also show that NERL avoids overfitting. Managerial implications are highlighted: NERL yields more stable inventory levels across all supply chain partners, and the bullwhip effect is damped.
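The abstract's core loop (an ANN actor whose weights are searched by an EA against a simulated inventory environment, benchmarked against an (s,Q) rule) can be sketched compactly. The following is a minimal, hypothetical Python illustration rather than the paper's implementation: the toy single-echelon environment, cost parameters, network size, the (mu, lambda) Evolution Strategy, and the names `rollout`, `ann_rule`, and `sq_rule` are all assumptions made for illustration only.

```python
# Minimal sketch of the neuroevolution loop described in the abstract.
# NOTE: the environment, costs, network size, and ES settings below are
# illustrative assumptions, not the paper's actual MEIO-DO-UD model.
import numpy as np

rng = np.random.default_rng(0)
IN, HID = 3, 8                    # state size and hidden width (assumed)
DIM = IN * HID + HID              # flat weight-vector length

def rollout(order_rule, horizon=52, seed=1):
    """Simulate one episode; order_rule maps state -> order quantity."""
    env = np.random.default_rng(seed)
    inventory, pipeline, cost = 50.0, 0.0, 0.0
    for _ in range(horizon):
        demand = env.poisson(10)          # uncertain demand (assumed Poisson)
        discount = env.random() < 0.2     # supplier discount arrives at random
        order = order_rule(inventory, pipeline, discount)
        inventory += pipeline - demand    # receive previous order, serve demand
        pipeline = order                  # one-period lead time (assumed)
        cost += order * (0.8 if discount else 1.0)  # purchasing (20% discount)
        cost += 0.5 * max(inventory, 0.0)           # holding cost
        cost += 5.0 * max(-inventory, 0.0)          # backorder penalty
    return cost

def ann_rule(weights):
    """Turn a flat weight vector into an order rule (tiny 1-hidden-layer ANN)."""
    w1 = weights[:IN * HID].reshape(IN, HID)
    w2 = weights[IN * HID:].reshape(HID, 1)
    def rule(inv, pipe, disc):
        state = np.array([inv, pipe, float(disc)]) / 50.0
        return max(0.0, (np.tanh(state @ w1) @ w2).item() * 50.0)
    return rule

def sq_rule(s=20.0, Q=30.0):
    """Classical continuous-review baseline: order Q when position drops to s."""
    return lambda inv, pipe, disc: Q if inv + pipe <= s else 0.0

# (mu, lambda) Evolution Strategy over the ANN weights; reusing one seed per
# generation gives common random numbers, which tames the noisy fitness.
parent, sigma = rng.normal(0.0, 0.1, DIM), 0.5
for gen in range(100):
    offspring = parent + sigma * rng.normal(size=(20, DIM))
    costs = [rollout(ann_rule(w), seed=gen) for w in offspring]
    parent = offspring[np.argsort(costs)[:5]].mean(axis=0)  # recombine elites

print("NERL cost :", rollout(ann_rule(parent), seed=999))
print("(s,Q) cost:", rollout(sq_rule(), seed=999))
```

An ES fits this setting because the simulated return is noisy and non-differentiable, so gradient-free search over the policy weights is natural; the same rollout harness evaluates the (s,Q) baseline, mirroring the paper's comparison at toy scale.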
Pages: 13