Risk-averse supply chain management via robust reinforcement learning

Cited by: 1
Authors
Wang, Jing [1 ]
Swartz, Christopher L. E. [2 ]
Huang, Kai [3 ]
Affiliations
[1] McMaster Univ, Sch Computat Sci & Engn, 1280 Main St West, Hamilton, ON L8S 4K1, Canada
[2] McMaster Univ, Dept Chem Engn, 1280 Main St West, Hamilton, ON L8S 4L7, Canada
[3] McMaster Univ, DeGroote Sch Business, 1280 Main St West, Hamilton, ON L8S 4M4, Canada
Keywords
Supply chain management; Reinforcement learning; Risk management; Worst-case criterion; Closed-loop supply chain; Supply chain simulation; SUPPLY-CHAIN; ORDERING MANAGEMENT; INVENTORY CONTROL; PROCESS SYSTEMS; BIG DATA; OPTIMIZATION; UNCERTAINTY; MODEL; NETWORK
DOI
10.1016/j.compchemeng.2024.108912
CLC classification
TP39 [Computer applications]
Subject classification
081203; 0835
Abstract
Classical reinforcement learning (RL) may suffer performance degradation when the environment deviates from training conditions, limiting its application in risk-averse supply chain management. This work explores using robust RL in supply chain operations to hedge against environment inconsistencies and changes. Two robust RL algorithms, robust Q-learning and beta-pessimistic Q-learning, are examined against conventional Q-learning and a baseline order-up-to inventory policy. Furthermore, this work extends RL applications from forward to closed-loop supply chains. Two case studies are conducted using a supply chain simulator developed with agent-based modeling. The results show that conventional Q-learning can outperform the baseline policy under normal conditions, but degrades notably under environment deviations. By comparison, the robust RL models tend to make more conservative inventory decisions to avoid large shortage penalties. Specifically, fine-tuned beta-pessimistic Q-learning can achieve good performance under normal conditions and maintain robustness against moderate environment inconsistencies, making it suitable for risk-averse decision-making.
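The two decision rules the abstract contrasts can be sketched in a few lines. This is an illustrative reading, not the paper's implementation: the function names, the tabular setting, and all parameter values below are assumptions, and the pessimistic update follows the common formulation in which the bootstrap target blends the best and worst next actions.

```python
import numpy as np

def order_up_to(base_stock, inventory_position):
    """Baseline order-up-to policy: order just enough to raise the
    current inventory position back to the base-stock level (never
    a negative quantity)."""
    return max(0, base_stock - inventory_position)

def beta_pessimistic_q_update(Q, s, a, r, s_next,
                              alpha=0.1, gamma=0.95, beta=0.2):
    """One tabular beta-pessimistic Q-learning update.

    Standard Q-learning (beta = 0) bootstraps from the best next
    action only; mixing in the worst next action with weight beta
    yields the more conservative value estimates described in the
    abstract."""
    target = r + gamma * ((1 - beta) * Q[s_next].max()
                          + beta * Q[s_next].min())
    Q[s, a] += alpha * (target - Q[s, a])
    return Q
```

With beta = 0 the update reduces to conventional Q-learning; larger beta weights the worst-case next action more heavily, trading expected reward for robustness to environment deviations such as demand shifts.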
Pages: 17