Multi-agent Reinforcement Learning for Traffic Signal Control

Cited by: 0
Authors
Prabuchandran, K. J. [1]
Kumar, Hemanth A. N. [1]
Bhatnagar, Shalabh [1]
Affiliations
[1] Indian Inst Sci, Dept Comp Sci & Automat, Bangalore 560012, Karnataka, India
Source
2014 IEEE 17TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS (ITSC) | 2014
Keywords
traffic signal control; multi-agent reinforcement learning; Q-learning; UCB; VISSIM;
DOI
Not available
Chinese Library Classification number
TP18 [Artificial intelligence theory];
Subject classification codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
Optimal control of traffic lights at junctions or traffic signal control (TSC) is essential for reducing the average delay experienced by the road users amidst the rapid increase in the usage of vehicles. In this paper, we formulate the TSC problem as a discounted cost Markov decision process (MDP) and apply multi-agent reinforcement learning (MARL) algorithms to obtain dynamic TSC policies. We model each traffic signal junction as an independent agent. An agent decides the signal duration of its phases in a round-robin (RR) manner using multi-agent Q-learning with either ε-greedy or UCB [3] based exploration strategies. It updates its Q-factors based on the cost feedback signal received from its neighbouring agents. This feedback signal can be easily constructed and is shown to be effective in minimizing the average delay of the vehicles in the network. We show through simulations over VISSIM that our algorithms perform significantly better than both the standard fixed signal timing (FST) algorithm and the saturation balancing (SAT) algorithm [15] over two real road networks.
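The per-junction learning loop described in the abstract can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the state encoding, the candidate green durations, the step sizes, the `JunctionAgent` class, and the `feedback_cost` combination rule (own cost plus the mean of the neighbours' costs) are all assumptions made for illustration, since the abstract does not specify them. Because the problem is posed as cost minimisation, the greedy action is the one with the lowest Q-factor, and the UCB bonus is subtracted rather than added.

```python
import math
import random
from collections import defaultdict


class JunctionAgent:
    """Tabular Q-learning agent for one junction (illustrative sketch only)."""

    def __init__(self, durations, alpha=0.1, gamma=0.9, epsilon=0.1,
                 ucb_c=1.0, exploration="epsilon", seed=0):
        self.durations = list(durations)  # hypothetical green times (seconds)
        self.alpha = alpha                # learning rate (assumed value)
        self.gamma = gamma                # discount factor (assumed value)
        self.epsilon = epsilon            # exploration rate for epsilon-greedy
        self.ucb_c = ucb_c                # scale of the UCB exploration bonus
        self.exploration = exploration    # "epsilon" or "ucb"
        self.rng = random.Random(seed)
        self.q = defaultdict(float)       # Q[(state, action)]: discounted cost
        self.visits = defaultdict(int)    # N[(state, action)]: update counts

    def greedy(self, state):
        # Costs are minimised, so the greedy action has the smallest Q-factor.
        return min(self.durations, key=lambda a: self.q[(state, a)])

    def act(self, state):
        if self.exploration == "ucb":
            untried = [a for a in self.durations
                       if self.visits[(state, a)] == 0]
            if untried:
                return untried[0]  # try every action once before using bounds
            total = sum(self.visits[(state, a)] for a in self.durations)

            def lower_bound(a):
                bonus = math.sqrt(2.0 * math.log(total) / self.visits[(state, a)])
                return self.q[(state, a)] - self.ucb_c * bonus

            return min(self.durations, key=lower_bound)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.durations)
        return self.greedy(state)

    def update(self, state, action, cost, next_state):
        # One-step Q-learning update for a discounted-cost MDP.
        self.visits[(state, action)] += 1
        best_next = min(self.q[(next_state, a)] for a in self.durations)
        target = cost + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


def feedback_cost(own_cost, neighbour_costs):
    """Hypothetical feedback signal: own cost plus mean neighbour cost."""
    if not neighbour_costs:
        return own_cost
    return own_cost + sum(neighbour_costs) / len(neighbour_costs)
```

In a round-robin scheme, each agent would call `act` when a phase begins, hold that phase for the chosen duration, and then call `update` with the cost returned by `feedback_cost` once its neighbours report their own costs.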
Pages: 2529-2534
Number of pages: 6
Related papers
19 in total
[1]  
Abdoos M, 2011, IEEE INT C INTELL TR, P1580, DOI 10.1109/ITSC.2011.6083114
[2]  
[Anonymous], 1998, REINFORCEMENT LEARNI
[3]   Reinforcement learning-based multi-agent system for network traffic signal control [J].
Arel, I. ;
Liu, C. ;
Urbanik, T. ;
Kohls, A. G. .
IET INTELLIGENT TRANSPORT SYSTEMS, 2010, 4 (02) :128-135
[4]   Finite-time analysis of the multiarmed bandit problem [J].
Auer, P ;
Cesa-Bianchi, N ;
Fischer, P .
MACHINE LEARNING, 2002, 47 (2-3) :235-256
[5]  
Bakker B, 2010, STUD COMPUT INTELL, V281, P475
[6]   Opportunities for multiagent systems and multiagent reinforcement learning in traffic control [J].
Bazzan, Ana L. C. .
AUTONOMOUS AGENTS AND MULTI-AGENT SYSTEMS, 2009, 18 (03) :342-375
[7]   Adaptive traffic signal control using approximate dynamic programming [J].
Cai, Chen ;
Wong, Chi Kwong ;
Heydecker, Benjamin G. .
TRANSPORTATION RESEARCH PART C-EMERGING TECHNOLOGIES, 2009, 17 (05) :456-474
[8]  
Cools SB, 2008, ADV INFORM KNOWL PRO, P41, DOI 10.1007/978-1-84628-982-8_3
[9]  
Dai Yujie., 2010, The 2010 International Joint Conference on Neural Networks (IJCNN), P1
[10]  
El-Tantawy S., 2010, 2010 13th International IEEE Conference on Intelligent Transportation Systems (ITSC 2010), P665, DOI 10.1109/ITSC.2010.5625066