Multi-Agent Deep Reinforcement Learning for Multi-Echelon Inventory Management

Cited by: 0
Authors
Liu, Xiaotian [1]
Hu, Ming [2]
Peng, Yijie [3]
Yang, Yaodong [4]
Affiliations
[1] Peking Univ, Guanghua Sch Management, Beijing, Peoples R China
[2] Univ Toronto, Rotman Sch Management, Toronto, ON M5S 3E6, Canada
[3] Peking Univ, PKU Wuhan Inst Artificial Intelligence, Guanghua Sch Management, Xiangjiang Lab, Beijing, Peoples R China
[4] Peking Univ, Inst Artificial Intelligence, PKU Wuhan Inst Artificial Intelligence, Beijing, Peoples R China
Funding
Natural Sciences and Engineering Research Council of Canada; National Science Foundation (USA);
Keywords
Multi-Echelon Inventory Management; Multi-Agent Reinforcement Learning; Bullwhip Effect; OPTIMAL POLICIES; OPTIMALITY;
DOI
10.1177/10591478241305863
Chinese Library Classification (CLC) number
T [Industrial Technology];
Discipline classification code
08;
Abstract
We apply heterogeneous-agent proximal policy optimization (HAPPO), a multi-agent deep reinforcement learning (MADRL) algorithm, to decentralized multi-echelon inventory management problems in both a serial supply chain and a supply chain network. We also examine whether the upfront-only information-sharing mechanism used in MADRL helps alleviate the bullwhip effect. Our results show that policies constructed by HAPPO achieve lower overall costs than policies constructed by single-agent deep reinforcement learning and other heuristic policies. Applying HAPPO also results in a less pronounced bullwhip effect than policies constructed by single-agent deep reinforcement learning, where information is not shared among actors. Somewhat surprisingly, HAPPO achieves lower overall costs when each actor's minimization target is a combination of its own costs and the overall costs of the system than when the target is the overall costs of the system alone. Our results provide a new perspective on the benefit of information sharing inside the supply chain, which helps alleviate the bullwhip effect and improve the overall performance of the system. Upfront information sharing and action coordination among actors during model training are both essential, the former even more so, for improving a supply chain's overall performance when applying MADRL. Neither fully self-interested actors nor fully system-focused actors lead to the best practical performance of policies learned and constructed by MADRL. Our results also verify MADRL's potential for solving various multi-echelon inventory management problems with complex supply chain structures and in non-stationary market environments.
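The two quantities the abstract turns on, a per-actor minimization target that mixes own costs with system costs, and the size of the bullwhip effect, can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the convex-combination form with weight `alpha`, the function names `blended_costs` and `bullwhip_ratio`, and the use of the order-to-demand variance ratio as the bullwhip measure (a common convention; the authors may define both quantities differently).

```python
from statistics import pvariance


def blended_costs(own_costs, alpha):
    """Hypothetical per-actor target: alpha * own cost + (1 - alpha) * system cost.

    alpha = 1.0 models fully self-interested actors,
    alpha = 0.0 models fully system-focused actors.
    The convex-combination form is an assumption, not the paper's formula.
    """
    system_cost = sum(own_costs)
    return [alpha * c + (1.0 - alpha) * system_cost for c in own_costs]


def bullwhip_ratio(orders, demand):
    """Assumed bullwhip measure: variance of orders placed upstream
    divided by variance of the demand observed downstream.
    Values above 1 indicate that variability is amplified."""
    return pvariance(orders) / pvariance(demand)


if __name__ == "__main__":
    # Illustrative numbers only; they are not results from the paper.
    print(blended_costs([1.0, 2.0, 3.0], alpha=0.5))
    print(bullwhip_ratio([8.0, 12.0, 8.0, 12.0], [9.0, 11.0, 9.0, 11.0]))
```

Under these assumptions, intermediate values of `alpha` correspond to the mixed target that the abstract reports as performing best, while the two endpoints correspond to the fully self-interested and fully system-focused cases.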
Pages: 21
Related papers
50 in total
  • [31] Distributed energy management of multi-area integrated energy system based on multi-agent deep reinforcement learning
    Ding, Lifu
    Cui, Youkai
    Yan, Gangfeng
    Huang, Yaojia
    Fan, Zhen
    INTERNATIONAL JOURNAL OF ELECTRICAL POWER & ENERGY SYSTEMS, 2024, 157
  • [32] Multi-Agent Reinforcement Learning with Reward Delays
    Zhang, Yuyang
    Zhang, Runyu
    Gu, Yuantao
    Li, Na
    LEARNING FOR DYNAMICS AND CONTROL CONFERENCE, VOL 211, 2023, 211
  • [33] Knowledge distillation for portfolio management using multi-agent reinforcement learning
    Chen, Min-You
    Chen, Chiao-Ting
    Huang, Szu-Hao
    ADVANCED ENGINEERING INFORMATICS, 2023, 57
  • [34] Multi-agent reinforcement learning for character control
    Cheng Li
    Levi Fussell
    Taku Komura
    The Visual Computer, 2021, 37 : 3115 - 3123
  • [35] BenchMARL: Benchmarking Multi-Agent Reinforcement Learning
    Bettini, Matteo
    Prorok, Amanda
    Moens, Vincent
    JOURNAL OF MACHINE LEARNING RESEARCH, 2024, 25
  • [36] A Review of Multi-Agent Reinforcement Learning Algorithms
    Liang, Jiaxin
    Miao, Haotian
    Li, Kai
    Tan, Jianheng
    Wang, Xi
    Luo, Rui
    Jiang, Yueqiu
    ELECTRONICS, 2025, 14 (04):
  • [37] Multi-agent reinforcement learning with weak ties
    Wang, Huan
    Zhou, Xu
    Kang, Yu
    Xue, Jian
    Yang, Chenguang
    Liu, Xiaofeng
    INFORMATION FUSION, 2025, 118
  • [38] Concept Learning for Interpretable Multi-Agent Reinforcement Learning
    Zabounidis, Renos
    Campbell, Joseph
    Stepputtis, Simon
    Hughes, Dana
    Sycara, Katia
    CONFERENCE ON ROBOT LEARNING, VOL 205, 2022, 205 : 1828 - 1837
  • [39] Learning structured communication for multi-agent reinforcement learning
    Junjie Sheng
    Xiangfeng Wang
    Bo Jin
    Junchi Yan
    Wenhao Li
    Tsung-Hui Chang
    Jun Wang
    Hongyuan Zha
    Autonomous Agents and Multi-Agent Systems, 2022, 36
  • [40] Industrial load management using multi-agent reinforcement learning for rescheduling
    Roesch, Martin
    Linder, Christian
    Bruckdorfer, Christian
    Hohmann, Andrea
    Reinhart, Gunther
    2019 SECOND INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE FOR INDUSTRIES (AI4I 2019), 2019 : 99 - 102