Risk-averse supply chain management via robust reinforcement learning

Citations: 1
Authors
Wang, Jing [1]
Swartz, Christopher L. E. [2]
Huang, Kai [3]
Affiliations
[1] McMaster Univ, Sch Computat Sci & Engn, 1280 Main St West, Hamilton, ON L8S 4K1, Canada
[2] McMaster Univ, Dept Chem Engn, 1280 Main St West, Hamilton, ON L8S 4L7, Canada
[3] McMaster Univ, DeGroote Sch Business, 1280 Main St West, Hamilton, ON L8S 4M4, Canada
Keywords
Supply chain management; Reinforcement learning; Risk management; Worst-case criterion; Closed-loop supply chain; Supply chain simulation; SUPPLY-CHAIN; ORDERING MANAGEMENT; INVENTORY CONTROL; PROCESS SYSTEMS; BIG DATA; OPTIMIZATION; UNCERTAINTY; MODEL; NETWORK
DOI
10.1016/j.compchemeng.2024.108912
Chinese Library Classification (CLC) number
TP39 [Computer applications]
Discipline classification codes
081203; 0835
Abstract
Classical reinforcement learning (RL) may suffer performance degradation when the environment deviates from training conditions, limiting its application in risk-averse supply chain management. This work explores using robust RL in supply chain operations to hedge against environment inconsistencies and changes. Two robust RL algorithms, Q̂-learning and beta-pessimistic Q-learning, are examined against conventional Q-learning and a baseline order-up-to inventory policy. Furthermore, this work extends RL applications from forward to closed-loop supply chains. Two case studies are conducted using a supply chain simulator developed with agent-based modeling. The results show that conventional Q-learning can outperform the baseline policy under normal conditions, but degrades notably under environment deviations. By comparison, the robust RL models tend to make more conservative inventory decisions to avoid large shortage penalties. Specifically, fine-tuned beta-pessimistic Q-learning can achieve good performance under normal conditions and maintain robustness against moderate environment inconsistencies, making it suitable for risk-averse decision-making.
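The update rules named in the abstract can be sketched as follows. This is a minimal tabular illustration using the commonly stated textbook forms of conventional Q-learning, beta-pessimistic Q-learning (a blend of the best and worst next-state values), and worst-case Q̂-learning (a running minimum over observed targets, starting from an optimistic table). It is not the authors' implementation: the function names, the hyperparameters `alpha`, `gamma`, `beta`, and the reward-maximization convention are all assumptions made for illustration.

```python
import numpy as np


def q_learning_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.95):
    """Conventional Q-learning: bootstrap from the best next-state value."""
    target = r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])


def beta_pessimistic_update(Q, s, a, r, s_next, beta=0.2, alpha=0.1, gamma=0.95):
    """Beta-pessimistic Q-learning (assumed form): weight the best next-state
    value by (1 - beta) and the worst by beta. beta = 0 recovers Q-learning."""
    blended = (1.0 - beta) * Q[s_next].max() + beta * Q[s_next].min()
    target = r + gamma * blended
    Q[s, a] += alpha * (target - Q[s, a])


def q_hat_update(Q_hat, s, a, r, s_next, gamma=0.95):
    """Worst-case Q-hat learning (assumed form): keep the smallest target seen.
    Q_hat should be initialised optimistically (above any achievable return)."""
    target = r + gamma * Q_hat[s_next].max()
    Q_hat[s, a] = min(Q_hat[s, a], target)


def order_up_to(inventory_position, target_level):
    """Baseline order-up-to policy: order the shortfall to the target level."""
    return max(0, target_level - inventory_position)


if __name__ == "__main__":
    # Hypothetical two-state, two-action table; values are illustrative only.
    Q = np.zeros((2, 2))
    Q[1] = [10.0, -10.0]
    beta_pessimistic_update(Q, s=0, a=0, r=1.0, s_next=1, beta=0.5, alpha=1.0, gamma=1.0)
    print(Q[0, 0])
    print(order_up_to(inventory_position=30, target_level=100))
```

A larger `beta` pulls the target toward the worst next-state value, which is one way to read the abstract's remark that the robust models make more conservative inventory decisions; the running minimum in the Q̂ sketch is the limiting, fully worst-case variant.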
Pages: 17
Related papers
50 in total
  • [21] Risk-Averse Production Planning
    Kawas, Ban
    Laumanns, Marco
    Pratsini, Eleni
    Prestwich, Steve
    ALGORITHMIC DECISION THEORY, 2011, 6992 : 108 - +
  • [22] Risk-Averse Selfish Routing
    Lianeas, Thanasis
    Nikolova, Evdokia
    Stier-Moses, Nicolas E.
    MATHEMATICS OF OPERATIONS RESEARCH, 2019, 44 (01) : 38 - 57
  • [23] The optimal order decisions of a risk-averse newsvendor under backlogging
    Zhang, Jianghua
    Chan, Felix T. S.
    Xu, Xinsheng
    ANNALS OF OPERATIONS RESEARCH, 2023, 329 (1-2) : 225 - 247
  • [24] Risk-Averse Home Energy Management System
    Ali, Saqib
    Malik, Tahir Nadeem
    Raza, Aamir
    IEEE ACCESS, 2020, 8 : 91779 - 91798
  • [25] Risk-averse algorithmic support and inventory management
    Narayanan, Pranadharthiharan
    Somasundaram, Jeeva
    Seifert, Matthias
    EUROPEAN JOURNAL OF OPERATIONAL RESEARCH, 2025, 322 (03) : 993 - 1004
  • [26] A robust optimization approach for risk-averse energy transactions in networked microgrids
    Wang, Luhao
    Li, Qiqiang
    Cheng, Xingong
    He, Guixiong
    Li, Guanguan
    Wang, Rui
    INNOVATIVE SOLUTIONS FOR ENERGY TRANSITIONS, 2019, 158 : 6595 - 6600
  • [27] Robust Strategy against Unknown Risk-averse Attackers in Security Games
    Qian, Yundi
    Haskell, William B.
    Tambe, Milind
    PROCEEDINGS OF THE 2015 INTERNATIONAL CONFERENCE ON AUTONOMOUS AGENTS & MULTIAGENT SYSTEMS (AAMAS'15), 2015, : 1341 - 1349
  • [28] Pricing and Coordination Strategy in a Green Supply Chain with a Risk-Averse Retailer
    Wang, Liyan
    Ye, Minghai
    Ma, Shanshan
    Sha, Yipeng
    MATHEMATICAL PROBLEMS IN ENGINEERING, 2019, 2019
  • [29] Pareto analysis of coordinating policies on a supply chain with a risk-averse retailer
    Tian, Yu
    Huang, Dao
    Fifth Wuhan International Conference on E-Business, Vols 1-3: INTEGRATION AND INNOVATION THROUGH MEASUREMENT AND MANAGEMENT, 2006, : 2262 - 2268
  • [30] Stable and Coordinating Contracts for a Supply Chain with Multiple Risk-Averse Suppliers
    Chen, Xin
    Shum, Stephen
    Simchi-Levi, David
    PRODUCTION AND OPERATIONS MANAGEMENT, 2014, 23 (03) : 379 - 392