IHG-MA: Inductive heterogeneous graph multi-agent reinforcement learning for multi-intersection traffic signal control

Cited by: 48
Authors
Yang, Shantian [1]
Yang, Bo [1]
Kang, Zhongfeng [1]
Deng, Lihui [1]
Affiliation
[1] Univ Elect Sci & Technol China, Sch Comp Sci & Engn, Chengdu 611731, Sichuan, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Heterogeneous graph neural network; Inductive heterogeneous graph representation learning; Multi-agent reinforcement learning; Transfer learning; Cooperative traffic signal control;
DOI
10.1016/j.neunet.2021.03.015
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification code
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Multi-agent deep reinforcement learning (MDRL) has been widely applied to multi-intersection traffic signal control. MDRL algorithms produce decentralized cooperative traffic-signal policies via multi-agent settings specialized to particular traffic networks. However, state-of-the-art MDRL algorithms have several drawbacks. (1) Traffic-signal policies should transfer smoothly to diverse traffic networks, but the specialized multi-agent settings hinder the policies from transferring and generalizing to new traffic networks. (2) Existing MDRL algorithms based on deep neural networks cannot flexibly handle the time-varying number of vehicles traversing a traffic network. (3) Existing MDRL algorithms based on homogeneous graph neural networks fail to capture the heterogeneous features of objects in traffic networks. Motivated by these observations, we propose the Inductive Heterogeneous Graph Multi-agent Actor-critic (IHG-MA) algorithm for multi-intersection traffic signal control. IHG-MA has two features. (1) It performs representation learning with a proposed inductive heterogeneous graph neural network (IHG). Being inductive, IHG can generate embeddings for previously unseen nodes (e.g., newly entering vehicles) and new graphs (e.g., new traffic networks). Unlike algorithms based on homogeneous graph neural networks, IHG encodes not only the heterogeneous features of each node but also heterogeneous structural (graph) information. (2) It performs policy learning with a proposed multi-agent actor-critic (MA) framework, which is decentralized and cooperative. The MA framework uses the final embeddings to compute the Q-value and the policy, and then optimizes the whole algorithm via the Q-value and policy losses. Experimental results on different traffic datasets show that IHG-MA outperforms state-of-the-art algorithms on multiple traffic metrics, making it a promising new algorithm for multi-intersection traffic signal control. (C) 2021 Elsevier Ltd. All rights reserved.
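The two ideas in the abstract, inductive aggregation over typed neighbours and actor-critic heads on the resulting embedding, can be illustrated with a toy sketch. This is not the authors' IHG-MA implementation: the mean aggregator, the single layer, the random weights, the node types ("vehicle", "intersection"), the feature sizes, and every function name below are assumptions chosen only for illustration. It shows why an aggregator that is shared per node type yields a fixed-size embedding however many vehicles are present, and how a policy and per-action Q-values might be read off that embedding.

```python
import math
import random


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned weights: small random values."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def mean_vectors(vectors, dim):
    """Element-wise mean; a zero vector when a node has no such neighbours."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def embed_node(self_feat, neighbours_by_type, weights, dim):
    """One aggregation step: type-specific weights applied to per-type means."""
    out = matvec(weights["self"], self_feat)
    for node_type, feats in neighbours_by_type.items():
        message = matvec(weights[node_type], mean_vectors(feats, dim))
        out = [o + m for o, m in zip(out, message)]
    return [max(0.0, o) for o in out]  # ReLU


def softmax(logits):
    """Numerically stable softmax."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Assumed sizes: 3 input features, 4-dim embedding, 2 signal phases (actions).
DIM, OUT, N_ACTIONS = 3, 4, 2
rng = random.Random(0)
weights = {t: random_matrix(OUT, DIM, rng)
           for t in ("self", "vehicle", "intersection")}
w_pi = random_matrix(N_ACTIONS, OUT, rng)  # actor head (policy logits)
w_q = random_matrix(N_ACTIONS, OUT, rng)   # critic head (one Q-value per action)


def act(self_feat, neighbours):
    """Embed one intersection agent, then compute its policy and Q-values."""
    h = embed_node(self_feat, neighbours, weights, DIM)
    return h, softmax(matvec(w_pi, h)), matvec(w_q, h)


# The same weights serve an intersection with one vehicle or with three.
few = {"vehicle": [[1.0, 0.0, 0.5]],
       "intersection": [[0.2, 0.1, 0.0]]}
many = {"vehicle": [[1.0, 0.0, 0.5], [0.3, 0.9, 0.1], [0.0, 0.4, 0.8]],
        "intersection": [[0.2, 0.1, 0.0]]}
h_few, pi_few, q_few = act([0.5, 0.5, 0.5], few)
h_many, pi_many, q_many = act([0.5, 0.5, 0.5], many)
```

Because the weights are indexed by node type rather than by node identity, the same functions apply unchanged to a vehicle or a network that was not seen when the weights were fitted, which is the sense of "inductive" used in the abstract.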
Pages: 265-277
Page count: 13