CLlight: Enhancing representation of multi-agent reinforcement learning with contrastive learning for cooperative traffic signal control

Cited by: 2
Authors
Fu, Xiang [1, 2]
Ren, Yilong [1, 2, 3, 7]
Jiang, Han [1, 2]
Lv, Jiancheng [4, 5, 6]
Cui, Zhiyong [1, 2]
Yu, Haiyang [1, 2, 3, 7]
Affiliations
[1] Beihang Univ, State Key Lab Intelligent Transportat Syst, Beijing 100191, Peoples R China
[2] Beihang Univ, Sch Transportat Sci & Engn, Beijing 100083, Peoples R China
[3] Zhongguancun Lab, Beijing 100094, Peoples R China
[4] Anhui Keli Informat Ind Co Ltd, Hefei 230000, Peoples R China
[5] Minist Publ Secur Peoples Republ China, Key Lab Urban ITS Technol Optimizat & Integrat, Hefei 230000, Peoples R China
[6] Key Lab Intelligent Transportat Anhui Prov, Hefei 230000, Peoples R China
[7] Beihang Univ, Hefei Innovat Res Inst, Anhui 230013, Peoples R China
Keywords
Traffic signal control; Contrastive learning; Multi-agent reinforcement learning; Masking strategy; GRAPH; SYSTEM;
DOI
10.1016/j.eswa.2024.125578
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-agent reinforcement learning, which treats each intersection as an agent, has shown great potential for coordinating multi-intersection traffic signals owing to its strong adaptive capabilities. However, in the real world, different intersections have distinct characteristics, such as unique vehicle distributions and traffic patterns. Most existing methods directly add neighboring intersection states to local intersections and optimize the cooperative policy network based on synthesized global features. This indirect optimization approach makes it difficult to thoroughly explore the mutual interactions among different intersection agents, preventing agents from truly learning features with cooperative awareness. To resolve these challenges, we introduce contrastive learning as a representation task into a multi-intersection traffic signal control approach, named CLlight, which updates the policy network in two stages. In the first stage, we utilize policy-based or actor-critic-based reinforcement learning methods such as A2C, SAC, and PPO to train policy networks with certain representational capabilities. In the second stage, by extracting pre- and post-masked features and reconstructing the post-masked features, the agents are encouraged to learn the similarities and differences between different intersection policies, which in turn enhances the cooperative and individual representation capabilities of the policy network. To the best of our knowledge, this is the first application of contrastive learning in the field of traffic signal control. Experimental results on synthetic and real-world datasets demonstrate superior average travel time and average waiting time compared with other state-of-the-art traffic signal control methods under various scenarios.
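The second stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the masking rule, the similarity measure, or the loss, so the random zero-masking, cosine similarity, InfoNCE-style loss, and the names `mask_features`, `cosine`, and `info_nce` are all assumptions made for illustration. Each intersection's pre-masked feature vector is treated as an anchor, its own post-masked vector as the positive, and the other intersections' vectors as negatives.

```python
import math
import random


def mask_features(features, mask_ratio, rng):
    """Zero out a random subset of entries (hypothetical masking strategy)."""
    n_mask = int(len(features) * mask_ratio)
    masked_idx = set(rng.sample(range(len(features)), n_mask))
    return [0.0 if i in masked_idx else x for i, x in enumerate(features)]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE-style loss over agents.

    anchors[i] should match positives[i]; every positives[j] with j != i
    acts as a negative (assumed contrastive objective, not from the paper).
    """
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Log-sum-exp with max subtraction for numerical stability.
        peak = max(logits)
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy per-intersection feature vectors (made-up numbers).
    pre_masked = [[1.0, 0.2, 0.1, 0.3], [0.1, 1.0, 0.2, 0.4], [0.3, 0.1, 1.0, 0.2]]
    post_masked = [mask_features(f, 0.25, rng) for f in pre_masked]
    print(info_nce(pre_masked, post_masked))
```

A lower loss means each agent's masked view is closer to its own unmasked features than to those of other intersections. In a real training loop the features would come from the policy network and the loss would be minimized by gradient descent, which this sketch leaves out.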
Pages: 14