Fair collaborative vehicle routing: A deep multi-agent reinforcement learning approach

Cited by: 4

Authors:

Mak, Stephen [1,4]

Xu, Liming [1]

Pearce, Tim [2,5]

Ostroumov, Michael [3]

Brintrup, Alexandra [1]

Affiliations:

[1] Univ Cambridge, Inst Mfg, Dept Engn, Cambridge, England

[2] Microsoft Res Cambridge, Cambridge, England

[3] Value Chain Lab, London, England

[4] 17 Charles Babbage Rd, Cambridge CB3 0FS, England

[5] Tsinghua Univ, Dept Comp Sci & Technol, Beijing, Peoples R China

Source:

TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES | 2023 / Volume 157

Funding:

Engineering and Physical Sciences Research Council (EPSRC), UK;

Keywords:

Collaborative vehicle routing; Deep multi-agent reinforcement learning; Negotiation; Gain sharing; Multi-agent systems; Machine learning; HORIZONTAL COOPERATION; ALLOCATION; LEVEL; COST; GAME;

DOI:

10.1016/j.trc.2023.104376

Chinese Library Classification (CLC) number:

U [Transportation];

Discipline classification code:

08 ; 0823 ;

Abstract:

Collaborative vehicle routing occurs when carriers collaborate through sharing their transportation requests and performing transportation requests on behalf of each other. This achieves economies of scale, thus reducing cost, greenhouse gas emissions and road congestion. But which carrier should partner with whom, and how much should each carrier be compensated? Traditional game theoretic solution concepts are expensive to calculate as the characteristic function scales exponentially with the number of agents. This would require solving the vehicle routing problem (NP-hard) an exponential number of times. We therefore propose to model this problem as a coalitional bargaining game solved using deep multi-agent reinforcement learning, where - crucially - agents are not given access to the characteristic function. Instead, we implicitly reason about the characteristic function; thus, when deployed in production, we only need to evaluate the expensive post-collaboration vehicle routing problem once. Our contribution is that we are the first to consider both the route allocation problem and gain sharing problem simultaneously - without access to the expensive characteristic function. Through decentralised machine learning, our agents bargain with each other and agree to outcomes that correlate well with the Shapley value - a fair profit allocation mechanism. Importantly, we are able to achieve a reduction in run-time of 88%.
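To make the cost of the traditional approach concrete, the following is a minimal sketch of exact Shapley value computation for a cost game. It is not the paper's method (the paper avoids the characteristic function entirely); it only illustrates why the baseline is expensive: every one of the 2^n coalitions must be priced, and in the routing setting each price would be a separate vehicle routing solve. The function name `shapley_values`, the carriers A, B, C, and all numbers in `TOY_COST` are hypothetical and chosen purely for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, cost):
    """Exact Shapley values for a cost game.

    `cost` is a callable mapping a frozenset of players to that
    coalition's cost. In collaborative routing, each call would stand
    for one (NP-hard) vehicle routing solve.
    """
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that p joins.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal cost p adds when joining coalition s.
                total += weight * (cost(s | {p}) - cost(s))
        values[p] = total
    return values


# Hypothetical characteristic function for three carriers
# (illustrative numbers only, not taken from the paper).
TOY_COST = {
    frozenset(): 0.0,
    frozenset("A"): 10.0,
    frozenset("B"): 12.0,
    frozenset("C"): 8.0,
    frozenset("AB"): 18.0,
    frozenset("AC"): 15.0,
    frozenset("BC"): 16.0,
    frozenset("ABC"): 24.0,
}

if __name__ == "__main__":
    allocation = shapley_values(["A", "B", "C"], TOY_COST.__getitem__)
    for carrier, share in allocation.items():
        print(carrier, round(share, 4))
```

With three carriers the table has 8 entries; with n carriers it has 2^n, which is the exponential growth the abstract refers to.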


Pages: 25

