Wang, Jing [1]
Swartz, Christopher L. E. [2]
Huang, Kai [3]
Institutions:
[1] McMaster Univ, Sch Computat Sci & Engn, 1280 Main St West, Hamilton, ON L8S 4K1, Canada
[2] McMaster Univ, Dept Chem Engn, 1280 Main St West, Hamilton, ON L8S 4L7, Canada
[3] McMaster Univ, DeGroote Sch Business, 1280 Main St West, Hamilton, ON L8S 4M4, Canada
Classical reinforcement learning (RL) may suffer performance degradation when the environment deviates from training conditions, limiting its application in risk-averse supply chain management. This work explores using robust RL in supply chain operations to hedge against environment inconsistencies and changes. Two robust RL algorithms, Q̂-learning and β-pessimistic Q-learning, are examined against conventional Q-learning and a baseline order-up-to inventory policy. Furthermore, this work extends RL applications from forward to closed-loop supply chains. Two case studies are conducted using a supply chain simulator developed with agent-based modeling. The results show that conventional Q-learning can outperform the baseline policy under normal conditions, but notably degrades under environment deviations. By comparison, the robust RL models tend to make more conservative inventory decisions to avoid large shortage penalties. Specifically, fine-tuned β-pessimistic Q-learning can achieve good performance under normal conditions and maintain robustness against moderate environment inconsistencies, making it suitable for risk-averse decision-making.
Ma, Lijun
Institution: Shenzhen Univ, Sch Management, Dept Management Sci, Shenzhen 518060, Peoples R China
Liu, Fangmei
Institution: Shenzhen Univ, Sch Management, Dept Management Sci, Shenzhen 518060, Peoples R China
Li, Sijie
Institution: Southeast Univ, Inst Syst Engn, Nanjing 211189, Jiangsu, Peoples R China
Yan, Houmin
Institution: Chinese Univ Hong Kong, Dept Syst Engn & Engn Management, Shatin, Hong Kong, Peoples R China
Li, Min
Institution: East China Univ Sci & Technol, Dept Math, Shanghai 200237, Peoples R China
Zhang, Jiahua
Institution: South China Univ Technol, Dept Elect Business, Guangzhou 510006, Guangdong, Peoples R China
Xu, Yifan
Institution: Fudan Univ, Sch Management, Shanghai 200433, Peoples R China
Wang, Wei
Institution: East China Univ Sci & Technol, Dept Math, Shanghai 200237, Peoples R China
Xu, Dianlei
Institutions:
Univ Helsinki, Dept Comp Sci, Helsinki 00014, Finland
Tsinghua Univ, Beijing Natl Res Ctr Informat Sci & Technol BNRist, Dept Elect Engn, Beijing 100084, Peoples R China
Su, Xiang
Institution: Norwegian Univ Sci & Technol, Dept Comp Sci, N-7034 Trondheim, Norway
Wang, Huandong
Institution: Tsinghua Univ, Beijing Natl Res Ctr Informat Sci & Technol BNRist, Dept Elect Engn, Beijing 100084, Peoples R China
Tarkoma, Sasu
Institution: Univ Helsinki, Dept Comp Sci, Helsinki 00014, Finland
Hui, Pan
Institutions:
Univ Helsinki, Dept Comp Sci, Helsinki 00014, Finland
Hong Kong Univ Sci & Technol, Dept Comp Sci & Engn, Clear Water Bay, Hong Kong, Peoples R China