Reinforcement learning approaches for specifying ordering policies of perishable inventory systems

Cited by: 88
Authors
Kara, Ahmet [1]
Dogan, Ibrahim [1]
Affiliations
[1] Erciyes Univ, Ind Engn Dept, Kayseri, Turkey
Keywords
Reinforcement learning; Inventory management system; Simulation-based optimization; Ordering management; Perishable item; SUPPLY-CHAIN; MANAGEMENT; OPTIMIZATION; SIMULATION; LEVEL; TIME;
DOI
10.1016/j.eswa.2017.08.046
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
In this study, we address the inventory management of perishable products under random demand and deterministic lead time, with the aim of minimizing a retailer's total cost. We investigate two ordering policies to emphasize the importance of age information in perishable inventory systems using Reinforcement Learning (RL). The stock-based policy replenishes stock according to stock quantities, while the age-based policy considers both the inventory level and the age of the items in stock. The problem is modeled using RL, and the policies are optimized using the Q-learning and Sarsa algorithms. The performance of the proposed policies is compared with that of similar policies from the literature. The experiments demonstrate that the ordering policy that takes age information into account appears to be an acceptable policy, and that learning with RL provides better results when demand has high variance and products have short lifetimes. (C) 2017 Elsevier Ltd. All rights reserved.
Pages: 150-158
Page count: 9