Distributed Multiagent Deep Reinforcement Learning for Multiline Dynamic Bus Timetable Optimization

Cited: 17
Authors
Yan, Haoyang [1 ,2 ]
Cui, Zhiyong [3 ]
Chen, Xinqiang [4 ]
Ma, Xiaolei [1 ,2 ]
Affiliations
[1] Beihang Univ, Beijing Key Lab Cooperat Vehicle Infrastruct Syst, Sch Transportat Sci & Engn, Beijing 100191, Peoples R China
[2] Beihang Univ, Beijing Adv Innovat Ctr Big Data & Brain Comp, Beijing 100191, Peoples R China
[3] Univ Washington, Dept Civil & Environm Engn, Seattle, WA 98195 USA
[4] Shanghai Maritime Univ, Inst Logist Sci & Engn, Shanghai 201306, Peoples R China
Funding
National Key Research and Development Program of China;
Keywords
Optimization; Costs; Vehicle dynamics; Heuristic algorithms; Games; Informatics; Data models; Bus timetable optimization; deep reinforcement learning (RL); distributed computing; multiagent; public transit; RESOURCE; SYSTEMS; LEVEL;
DOI
10.1109/TII.2022.3158651
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
As a primary countermeasure to mitigate traffic congestion and air pollution, promoting public transit has become a global consensus. Designing a robust and reliable bus timetable is a pivotal step for transit authorities to increase ridership and reduce operating costs. However, most previous studies on bus timetabling rely on historical passenger count and travel time data to generate static schedules, which often yield biased results under uncertain scenarios such as demand surges or adverse weather. In addition, acquiring real-time passenger origin/destination information from a limited number of running buses is infeasible. To address these issues, this article formulates the multiline dynamic bus timetable optimization problem as a Markov decision process and proposes a multiagent deep reinforcement learning framework that learns effectively in an imperfect-information game, where passenger demand and traffic conditions are not always known in advance. Moreover, a distributed reinforcement learning algorithm is applied to overcome high computational cost and low training efficiency. A case study of multiple bus lines in Beijing, China, confirms the effectiveness and efficiency of the proposed model: the method outperforms heuristic and state-of-the-art reinforcement learning algorithms, reducing combined operating and passenger costs by 20.30% compared with the actual timetables.
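To make the abstract's formulation concrete, the following minimal Python sketch casts a single bus line as a Markov decision process: at each decision step an agent chooses to dispatch or hold a bus, and the reward is the negative sum of operating and passenger-waiting costs, with demand revealed only after the action (the imperfect-information setting). All class names, cost weights, and the toy demand model are illustrative assumptions and do not reproduce the paper's actual state, action, or reward design.

```python
# Illustrative toy only (not the authors' code): one bus line cast as an MDP.
# A multiagent setup would run one such environment and one agent per line.
# The arrival model, cost weights, and threshold policy below are assumptions.
import random
from dataclasses import dataclass

@dataclass
class BusLineEnv:
    """A single bus line modeled as a Markov decision process."""
    arrival_rate: float = 2.0    # hypothetical mean passenger arrivals per step
    dispatch_cost: float = 10.0  # hypothetical operating cost per dispatched bus
    wait_weight: float = 0.5     # hypothetical cost per waiting passenger per step
    waiting: int = 0             # passengers currently waiting along the line

    def step(self, dispatch: bool):
        """Apply a dispatch/hold action; return (next_state, reward)."""
        cost = self.wait_weight * self.waiting       # passenger waiting cost
        if dispatch:
            cost += self.dispatch_cost               # operating cost of the run
            self.waiting = 0                         # bus collects waiting riders
        # Stochastic demand: realized arrivals are unknown in advance,
        # mirroring the imperfect-information game described in the abstract.
        n = int(2 * self.arrival_rate)
        self.waiting += sum(random.random() < 0.5 for _ in range(n))
        return (self.waiting,), -cost                # reward = negative total cost

if __name__ == "__main__":
    # Three lines, each with its own agent; a naive threshold policy stands
    # in for a trained deep RL agent purely to show the interaction loop.
    lines = [BusLineEnv(arrival_rate=r) for r in (1.5, 2.0, 3.0)]
    total = 0.0
    for _ in range(100):                             # decision steps
        for env in lines:                            # one action per line per step
            _, reward = env.step(dispatch=env.waiting >= 8)
            total += reward
    print(f"total reward over rollout: {total:.1f}")
```

In the paper's setting, a learned policy trained with a distributed actor-learner scheme would replace the threshold rule and the serial loop above, which is how the abstract's claim of reduced computational cost would be realized.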
Pages: 469-479
Page count: 11