MonitorLight: Reinforcement Learning-based Traffic Signal Control Using Mixed Pressure Monitoring

Cited by: 10
Authors
Fang, Zekuan [1]
Zhang, Fan [1]
Wang, Ting [1]
Lian, Xiang [2]
Chen, Mingsong [1]
Affiliations
[1] East China Normal Univ, MoE Engn Res Ctr SW HW Codesign Tech & App, Shanghai, Peoples R China
[2] Kent State Univ, Dept Comp Sci, Kent, OH 44242 USA
Source
PROCEEDINGS OF THE 31ST ACM INTERNATIONAL CONFERENCE ON INFORMATION AND KNOWLEDGE MANAGEMENT, CIKM 2022 | 2022
Keywords
Reinforcement learning; Traffic signal control; Phase duration; Average travel time; Fairness; SYSTEM;
DOI
10.1145/3511808.3557400
Chinese Library Classification (CLC) number
TP [Automation technology, computer technology];
Discipline classification code
0812;
Abstract
Although Reinforcement Learning (RL) has achieved significant success in Traffic Signal Control (TSC), most existing approaches focus on the design of RL elements and neglect the impact of phase duration. Because dynamic phase duration is left unexplored, the overall performance and convergence rate of RL-based TSC approaches cannot be guaranteed, which may result in poor adaptability of RL methods to different traffic conditions. To address these issues, we formulate a novel phase-duration-aware TSC (PDA-TSC) problem and propose an effective RL-based TSC approach named MonitorLight. Our approach adopts a new traffic indicator, mixed pressure, which enables RL agents to simultaneously analyze the impacts of stationary and moving vehicles on intersections. Based on the observed mixed pressure of intersections, RL agents can autonomously determine in real time whether to change the current signals. In addition, MonitorLight can adjust its control method for scenarios with different real-time requirements and achieve excellent results in different situations. Extensive experiments on both real-world and synthetic datasets demonstrate that MonitorLight outperforms the current state-of-the-art IPDALight by up to 2.84% and 5.71% in average vehicle travel time, respectively. Moreover, our method significantly speeds up convergence, leading IPDALight by 36.87% and 34.58% in start-to-converge episode and jumpstart performance, respectively.
Pages: 478-487
Number of pages: 10