Using Reinforcement Learning to Control Traffic Signals in a Real-World Scenario: An Approach Based on Linear Function Approximation

Cited by: 20

Authors:

Alegre, Lucas N. [1]

Ziemke, Theresa [2]

Bazzan, Ana L. C. [1]

Affiliations:

[1] Univ Fed Rio Grande do Sul, Inst Informat, BR-91501970 Porto Alegre, RS, Brazil

[2] Tech Univ Berlin, Transport Syst Planning & Transport Telemat Dept, D-10623 Berlin, Germany

Source:

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS | 2022 / Vol. 23 / No. 07

Keywords:

Traffic signal control; reinforcement learning; function approximation; multiagent systems; SIMULATION; NETWORK;

DOI:

10.1109/TITS.2021.3091014

Chinese Library Classification (CLC) number:

TU [Building Science];

Discipline classification code:

0813 ;

Abstract:

Reinforcement learning is an efficient, widely used machine learning technique that performs well in problems with a reasonable number of states and actions. This is rarely the case in control-related problems such as traffic signal control, where the state space can be very large. One way to deal with the curse of dimensionality is to use generalization techniques such as function approximation. In this paper, traffic signal agents in a network of signalized intersections use linear function approximation. Specifically, a true online SARSA(lambda) algorithm with Fourier basis functions (TOS(lambda)-FB) is employed. This method has the advantage of convergence guarantees and error bounds, which non-linear function approximation generally lacks. To evaluate TOS(lambda)-FB, we perform experiments in variations of an isolated intersection scenario and in a scenario of the city of Cottbus, Germany, with 22 signalized intersections, implemented in MATSim. We compare our results not only to fixed-time controllers but also to a state-of-the-art rule-based adaptive method. TOS(lambda)-FB performs far better than the fixed-time controllers while being at least as efficient as the rule-based approach. For more than half of the intersections, our approach leads to less congestion and delay, without needing the knowledge that underlies the rule-based approach.
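
As an illustration of the technique named in the abstract, the following is a minimal sketch of a linear true online SARSA(lambda) update over Fourier basis features. It is not the authors' implementation: the function and class names, the per-action feature layout, the assumption that states are scaled to [0, 1], and the hyperparameter values are all hypothetical choices made for this example.

```python
import itertools
import numpy as np


def fourier_features(state, order):
    """Fourier basis features cos(pi * c . s) for a state scaled to [0, 1]^d."""
    state = np.asarray(state, dtype=float)
    # One coefficient vector c per combination of integers in {0, ..., order}.
    coeffs = np.array(list(itertools.product(range(order + 1), repeat=state.size)))
    return np.cos(np.pi * coeffs @ state)


def state_action_features(state, action, n_actions, order):
    """Copy the state features into the block of the vector owned by `action`."""
    phi = fourier_features(state, order)
    x = np.zeros(n_actions * phi.size)
    x[action * phi.size:(action + 1) * phi.size] = phi
    return x


class TrueOnlineSarsaLambda:
    """Linear true online SARSA(lambda) with a dutch eligibility trace."""

    def __init__(self, n_features, alpha=0.01, gamma=0.95, lam=0.9):
        # alpha, gamma and lam are illustrative defaults, not values from the paper.
        self.w = np.zeros(n_features)   # weights of the linear Q-function
        self.z = np.zeros(n_features)   # eligibility trace
        self.q_old = 0.0
        self.alpha, self.gamma, self.lam = alpha, gamma, lam

    def q(self, x):
        return float(self.w @ x)

    def update(self, x, reward, x_next):
        """One step for the transition (x, reward, x_next) under the current policy."""
        q, q_next = self.q(x), self.q(x_next)
        delta = reward + self.gamma * q_next - q
        decay = self.gamma * self.lam
        self.z = decay * self.z + (1.0 - self.alpha * decay * (self.z @ x)) * x
        self.w += self.alpha * (delta + q - self.q_old) * self.z \
            - self.alpha * (q - self.q_old) * x
        self.q_old = q_next
```

In use, an agent would build `x` and `x_next` with `state_action_features` for the current and next state-action pairs (for example, chosen epsilon-greedily over `q`) and call `update` once per decision step.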

Citation

Pages: 9126-9135

Page count: 10

45 references in total

[1] Abdoos, Monireh; Mozayani, Nasser; Bazzan, Ana L. C. Hierarchical control of traffic signals using Q-learning with tile coding [J]. APPLIED INTELLIGENCE, 2014, 40 (02): 201-213

[2] [Anonymous], 2010, 1009019 SANT FE I

[3] Aslani, Mohammad; Mesgari, Mohammad Saadi; Wiering, Marco. Adaptive traffic signal control with actor-critic methods in a real-world traffic network with different traffic disruption events [J]. TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES, 2017, 85: 732-752

[4] Axhausen K. W., 2016, The multi-agent transport simulation MATSim, DOI 10.5334/BAW

[5] Baird L., 1995, Machine Learning. Proceedings of the Twelfth International Conference on Machine Learning, P30

[6] Balaji, P. G.; German, X.; Srinivasan, D. Urban traffic signal control using reinforcement learning agents [J]. IET INTELLIGENT TRANSPORT SYSTEMS, 2010, 4 (03): 177-188

[7] Bazzan, Ana L. C. Opportunities for multiagent systems and multiagent reinforcement learning in traffic control [J]. AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2009, 18 (03): 342-375

[8] Chang E. C. P., 1988, 467 TEX TRANSP I

[9] de Oliveira, Lucas Barcelos; Camponogara, Eduardo. Multi-agent model predictive control of signaling split in urban traffic networks [J]. TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES, 2010, 18 (01): 120-139

[10] Diakaki, C; Papageorgiou, M; Aboudolas, K. A multivariable regulator approach to traffic-responsive network-wide signal control [J]. CONTROL ENGINEERING PRACTICE, 2002, 10 (02): 183-195
