Optimization of Apparel Supply Chain Using Deep Reinforcement Learning

Cited by: 9
Authors
Chong, Ji Won [1 ]
Kim, Wooju [1 ]
Hong, Jun Seok [2 ]
Affiliations
[1] Yonsei Univ, Dept Ind Engn, Seoul 03722, South Korea
[2] Kyonggi Univ, Dept Management Informat Syst, Suwon 16227, Gyeonggi Do, South Korea
Keywords
Inventory management; inventory control; optimization; transportation; supply chain management; reinforcement learning; deep reinforcement learning; deep learning; Markov decision process; soft actor-critic
DOI
10.1109/ACCESS.2022.3205720
Chinese Library Classification
TP [Automation Technology; Computer Technology]
Discipline code
0812
Abstract
An effective supply chain management system is indispensable for an enterprise operating a supply chain network. In particular, organized control over the production and transportation of its products is a key success factor for the enterprise to stay active without damaging its reputation. This is equally true of the garment industry. In this study, an extensive Deep Reinforcement Learning study for apparel supply chain optimization is undertaken, with a focus on Soft Actor-Critic. Six models are experimented with and compared with respect to sell-through rate, service level, and inventory-to-sales ratio. Soft Actor-Critic outperformed several other state-of-the-art actor-critic models in managing inventories and fulfilling demand. Furthermore, explicit indicators are calculated to assess the performance of the models in the experiment. Soft Actor-Critic achieved a better balance between service level and sell-through rate by ensuring higher availability of stock to sell without overstocking. Numerical experiments show that the S-policy, Trust Region Policy Optimization, and Twin Delayed Deep Deterministic Policy Gradient also strike a good balance between service level and sell-through rate. Additionally, Soft Actor-Critic achieved a 7%, 41.6%, and 42.8% lower inventory-to-sales ratio than the S-policy, Twin Delayed Deep Deterministic Policy Gradient, and Trust Region Policy Optimization models, respectively, indicating its superior ability to keep inventory stock available for sales and profit.
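The three indicators named in the abstract can be illustrated with a small sketch. The formulas below are the common textbook definitions of sell-through rate, service level (fill rate), and inventory-to-sales ratio, not the paper's own implementation; the function names and toy numbers are hypothetical.

```python
def sell_through_rate(units_sold, units_received):
    """Fraction of received stock that was actually sold in the period."""
    return units_sold / units_received

def service_level(demand_fulfilled, total_demand):
    """Fraction of demand served directly from stock (fill rate)."""
    return demand_fulfilled / total_demand

def inventory_to_sales_ratio(avg_inventory_value, sales_value):
    """Average inventory held per unit of sales; lower means leaner stock."""
    return avg_inventory_value / sales_value

# Toy period: 80 of 100 received units sold, 80 of 90 demanded units served,
# average inventory worth 30 against sales of 80.
print(sell_through_rate(80, 100))        # 0.8
print(service_level(80, 90))
print(inventory_to_sales_ratio(30, 80))  # 0.375
```

A policy that pushes service level up by overstocking will drag the inventory-to-sales ratio up with it, which is why the paper evaluates the metrics jointly.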
Pages: 100367-100375 (9 pages)