IHG-MA: Inductive heterogeneous graph multi-agent reinforcement learning for multi-intersection traffic signal control

Cited by: 48
Authors
Yang, Shantian [1 ]
Yang, Bo [1 ]
Kang, Zhongfeng [1 ]
Deng, Lihui [1 ]
Affiliation
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Sichuan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Heterogeneous graph neural network; Inductive heterogeneous graph representation learning; Multi-agent reinforcement learning; Transfer learning; Cooperative traffic signal control;
DOI
10.1016/j.neunet.2021.03.015
CLC classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-agent deep reinforcement learning (MDRL) has been widely applied to multi-intersection traffic signal control. MDRL algorithms produce decentralized cooperative traffic-signal policies via specialized multi-agent settings tailored to particular traffic networks. However, state-of-the-art MDRL algorithms have several drawbacks. (1) Traffic-signal policies should transfer smoothly to diverse traffic networks, but the specialized multi-agent settings hinder the policies from transferring and generalizing to new traffic networks. (2) Existing MDRL algorithms based on deep neural networks cannot flexibly handle a time-varying number of vehicles traversing the traffic network. (3) Existing MDRL algorithms based on homogeneous graph neural networks fail to capture the heterogeneous features of objects in traffic networks. Motivated by these observations, we propose an Inductive Heterogeneous Graph Multi-agent Actor-critic (IHG-MA) algorithm for multi-intersection traffic signal control. The proposed IHG-MA algorithm has two components. (1) Representation learning is performed by a proposed inductive heterogeneous graph neural network (IHG), which can generate embeddings for previously unseen nodes (e.g., newly entering vehicles) and new graphs (e.g., new traffic networks). Unlike algorithms based on homogeneous graph neural networks, the IHG algorithm encodes not only the heterogeneous features of each node but also heterogeneous structural (graph) information. (2) Policy learning is performed by a proposed multi-agent actor-critic (MA) framework, which is decentralized and cooperative. The MA framework uses the final embeddings to compute the Q-values and policies, and optimizes the whole algorithm via the Q-value and policy losses.
Experimental results on different traffic datasets show that the IHG-MA algorithm outperforms state-of-the-art algorithms on multiple traffic metrics, making it a promising new algorithm for multi-intersection traffic signal control. (C) 2021 Elsevier Ltd. All rights reserved.
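The abstract's two components can be illustrated with a minimal sketch: a per-node-type projection plus neighbor mean-pooling (which is inductive, since it uses only node features and shared per-type weights rather than a node-id lookup table), followed by actor and critic heads over the resulting embedding. All names (`hg_embed`, `policy_and_q`, the feature layouts, random weights) are illustrative assumptions, not the paper's actual architecture.

```python
import math
import random

random.seed(0)

# Toy heterogeneous traffic graph: two node types with different feature sizes.
NODE_FEATS = {
    "intersection": 4,  # e.g. current phase, queue lengths (illustrative)
    "vehicle": 2,       # e.g. speed, waiting time (illustrative)
}
EMBED_DIM = 3

def make_weights(in_dim, out_dim):
    """Random projection matrix (stand-in for learned parameters)."""
    return [[random.uniform(-1, 1) for _ in range(in_dim)] for _ in range(out_dim)]

# One projection per node type, mapping heterogeneous features
# into a shared embedding space.
W = {ntype: make_weights(dim, EMBED_DIM) for ntype, dim in NODE_FEATS.items()}

def project(ntype, feats):
    return [sum(w * f for w, f in zip(row, feats)) for row in W[ntype]]

def hg_embed(node_feats, node_type, neighbors):
    """Inductive embedding: project the node itself, mean-pool the projected
    neighbors, and sum. Previously unseen nodes are handled naturally because
    only features and shared per-type weights are used."""
    h_self = project(node_type, node_feats)
    pooled = [0.0] * EMBED_DIM
    for ntype, feats in neighbors:
        h_n = project(ntype, feats)
        pooled = [p + h / len(neighbors) for p, h in zip(pooled, h_n)]
    return [hs + hp for hs, hp in zip(h_self, pooled)]

# Actor (policy) and critic (Q-value) heads on top of the embedding.
N_PHASES = 2  # number of signal phases an agent can choose (assumed)
W_pi = make_weights(EMBED_DIM, N_PHASES)
W_q = make_weights(EMBED_DIM, N_PHASES)

def policy_and_q(embedding):
    logits = [sum(w * e for w, e in zip(row, embedding)) for row in W_pi]
    z = max(logits)
    exp = [math.exp(l - z) for l in logits]
    pi = [e / sum(exp) for e in exp]  # softmax over phases
    q = [sum(w * e for w, e in zip(row, embedding)) for row in W_q]
    return pi, q

# An intersection agent with two previously unseen vehicle neighbors:
emb = hg_embed([1.0, 0.5, 0.0, 2.0], "intersection",
               [("vehicle", [0.3, 1.2]), ("vehicle", [0.9, 0.1])])
pi, q = policy_and_q(emb)
print(len(emb), len(pi), len(q))  # → 3 2 2
```

In training, the policy and Q-value losses would back-propagate through both heads into the shared per-type projections; the sketch only shows the forward pass.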
Pages: 265-277 (13 pages)
References
57 in total (entries [11]-[20] shown)
[11]   Multi-Agent Deep Reinforcement Learning for Large-Scale Traffic Signal Control [J].
Chu, Tianshu ;
Wang, Jie ;
Codeca, Lara ;
Li, Zhaojian .
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2020, 21 (03) :1086-1095
[12]  
Codeca L., 2018, EPiC Series in Engineering, V2, P43, DOI 10.29007/1zt5
[13]  
Dusparic I., 2018, 26th Irish Conference on Artificial Intelligence and Cognitive Science
[14]  
El-Tantawy S., 2010, 2010 13th International IEEE Conference on Intelligent Transportation Systems (ITSC 2010), P665, DOI 10.1109/ITSC.2010.5625066
[15]  
Foerster J., 2017, Proceedings of Machine Learning Research, V70
[16]  
Genders W., 2016, Using a deep reinforcement learning agent for traffic signal control, arXiv preprint arXiv:1611.01142, P1
[17]   Distributed Geometric Fuzzy Multiagent Urban Traffic Signal Control [J].
Gokulan, Balaji Parasumanna ;
Srinivasan, Dipti .
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, 2010, 11 (03) :714-727
[18]   Including congestion effects in urban road traffic CO2 emissions modelling: Do Local Government Authorities have the right options? [J].
Grote, Matt ;
Williams, Ian ;
Preston, John ;
Kemp, Simon .
TRANSPORTATION RESEARCH PART D-TRANSPORT AND ENVIRONMENT, 2016, 43 :95-106
[19]  
Haarnoja T., 2018, Proceedings of Machine Learning Research, V80
[20]  
Hamilton W.L., 2017, Proceedings of the 31st International Conference on Neural Information Processing Systems, P1025