Deep reinforcement learning for traffic signal control with consistent state and reward design approach

Cited by: 41

Authors:

Bouktif, Salah [1]

Cheniki, Abderraouf [1]

Ouni, Ali [2]

El-Sayed, Hesham [1]

Affiliations:

[1] United Arab Emirates Univ, CIT, Abu Dhabi, U Arab Emirates

[2] Univ Quebec, ETS Montreal, Montreal, PQ, Canada

Source:

KNOWLEDGE-BASED SYSTEMS | 2023 / Volume 267

Keywords:

Traffic signal control; Traffic optimization; Reinforcement learning

DOI:

10.1016/j.knosys.2023.110440

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory]

Discipline classification codes:

081104; 0812; 0835; 1405

Abstract:

Intelligent Transportation Systems have become essential as traffic congestion problems and challenges increase. Traffic Signal Control (TSC) plays a critical role in optimizing traffic flow and mitigating congestion in urban areas. Many studies have sought to improve the behavior of TSCs at intersections and thereby reduce traffic congestion. Researchers have recently leveraged Deep Learning (DL) and Reinforcement Learning (RL) techniques to optimize TSCs. In the RL framework, the agent interacts with the surrounding world through states, rewards, and actions. The formulation of these key elements is crucial, as it shapes how the RL agent behaves and optimizes its policy. However, most existing frameworks rely on hand-crafted state and reward designs, which restrict the RL agent from acting optimally. In this paper, we propose a novel approach to formulating state and reward definitions in order to boost the performance of the traffic signal controller agent. The intuitive idea is to define both state and reward in a consistent and straightforward manner. We argue that such a design approach helps achieve training stability and hence rapid convergence to the best policies. We use the double deep Q-network (DDQN) along with prioritized experience replay (PER) for the agent architecture. To evaluate the performance of our approach, we conduct a series of simulations using the Simulation of Urban MObility (SUMO) environment. Statistical analysis of our results shows that our proposal outperforms state-of-the-art state and reward design approaches. © 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
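
The abstract names DDQN and PER but this record gives no implementation details. As a rough illustration only, the following Python sketch shows the standard double-DQN bootstrap target and the standard proportional PER sampling probabilities. The function names, the hyperparameter values (`gamma`, `alpha`, `eps`), and the toy Q-values are assumptions made for illustration; they are not taken from the paper.

```python
from typing import List, Sequence


def double_dqn_target(reward: float, done: bool, gamma: float,
                      q_online_next: Sequence[float],
                      q_target_next: Sequence[float]) -> float:
    """Standard double-DQN target for one transition (illustrative, not the paper's code)."""
    if done:
        # No bootstrapping from a terminal state.
        return reward
    # The online network selects the greedy next action...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ...and the target network evaluates that action.
    return reward + gamma * q_target_next[best_action]


def per_probabilities(td_errors: Sequence[float],
                      alpha: float = 0.6, eps: float = 1e-6) -> List[float]:
    """Proportional PER: P(i) = p_i^alpha / sum_k p_k^alpha with p_i = |delta_i| + eps.

    alpha and eps are assumed values, not the paper's settings.
    """
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


# Toy usage with made-up numbers.
target = double_dqn_target(reward=1.0, done=False, gamma=0.9,
                           q_online_next=[1.0, 3.0],
                           q_target_next=[5.0, 2.0])
probs = per_probabilities([0.1, 2.0, 0.5])
```

In the toy call, the online network prefers action 1, so the target uses the target network's value for that action (2.0) rather than its own maximum (5.0), giving roughly 1.0 + 0.9 × 2.0 = 2.8. Decoupling action selection from evaluation in this way is what distinguishes double DQN from plain DQN.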


Pages: 18

