Traffic signal optimization through discrete and continuous reinforcement learning with robustness analysis in downtown Tehran

Cited by: 34
Authors
Aslani, Mohammad [1 ]
Seipel, Stefan [1 ,2 ]
Mesgari, Mohammad Saadi [3 ]
Wiering, Marco [4 ]
Affiliations
[1] Univ Gavle, Dept Ind Dev IT & Land Management, Gavle, Sweden
[2] Uppsala Univ, Div Visual Informat & Interact, Dept Informat Technol, Uppsala, Sweden
[3] KN Toosi Univ Technol, Fac Geodesy & Geomat Engn, Tehran, Iran
[4] Univ Groningen, Inst Artificial Intelligence & Cognit Engn, Groningen, Netherlands
Keywords
Reinforcement learning; System disturbances; Traffic signal control; Microscopic traffic simulation; LIGHT CONTROL; ALGORITHMS; EXPLORATION; PEDESTRIANS; NETWORKS; DESIGN; SYSTEM; MODEL;
DOI
10.1016/j.aei.2018.08.002
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline codes
081104; 0812; 0835; 1405
Abstract
Traffic signal control plays a pivotal role in reducing traffic congestion. Traffic signals cannot be adequately controlled with conventional methods due to the high variability and complexity of traffic environments. In recent years, reinforcement learning (RL) has shown great potential for traffic signal control because of its high adaptability, flexibility, and scalability. However, designing RL-embedded traffic signal controllers (RLTSCs) for traffic systems with a high degree of realism faces several challenges; among them, system disturbances and large state-action spaces are considered in this research. The contribution of the present work rests on three features: (a) evaluating the robustness of different RLTSCs against system disturbances including incidents, jaywalking, and sensor noise, (b) handling a high-dimensional state-action space both by employing different continuous state RL algorithms and by reducing the state-action space in order to improve the performance and learning speed of the system, and (c) presenting a detailed empirical study of traffic signal control in downtown Tehran through seven RL algorithms: discrete state Q-learning(lambda), SARSA(lambda), and actor-critic(lambda), and continuous state Q-learning(lambda), SARSA(lambda), actor-critic(lambda), and residual actor-critic(lambda). In this research, first a real-world microscopic traffic simulation of downtown Tehran is carried out; then four experiments are performed in order to find the best RLTSC with convincing robustness and strong performance. The results reveal that the RLTSC based on continuous state actor-critic(lambda) has the best performance. In addition, it is found that the best RLTSC reduces average travel time by 22% (in the presence of high system disturbances) compared with an optimized fixed-time controller.
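To make the family of compared algorithms concrete, the following is a minimal sketch of tabular SARSA(lambda) with eligibility traces, one of the seven algorithms the abstract lists. The state/action encoding, reward signal, and hyperparameter values are hypothetical illustrations, not the paper's actual design.

```python
import random
from collections import defaultdict

# Hypothetical hyperparameters, not taken from the paper.
ALPHA, GAMMA, LAMBDA, EPSILON = 0.1, 0.95, 0.9, 0.1
ACTIONS = [0, 1]  # e.g. 0 = keep current signal phase, 1 = switch phase

Q = defaultdict(float)  # Q[(state, action)] action-value table
E = defaultdict(float)  # eligibility traces per (state, action) pair

def epsilon_greedy(state):
    """Explore with probability EPSILON, otherwise pick the greedy action."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def sarsa_lambda_step(s, a, r, s_next):
    """One SARSA(lambda) update: the TD error is propagated to all
    recently visited state-action pairs via decaying eligibility traces."""
    a_next = epsilon_greedy(s_next)
    delta = r + GAMMA * Q[(s_next, a_next)] - Q[(s, a)]
    E[(s, a)] += 1.0                      # accumulating trace
    for key in list(E):
        Q[key] += ALPHA * delta * E[key]  # credit-weighted update
        E[key] *= GAMMA * LAMBDA          # decay traces over time
    return a_next
```

A traffic-signal agent would call `sarsa_lambda_step` once per decision interval, with `s` encoding e.g. queue lengths at the intersection and `r` a (negative) delay-based reward; the trace decay `GAMMA * LAMBDA` is what distinguishes SARSA(lambda) from plain one-step SARSA.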
Pages: 639-655
Number of pages: 17