Threshold Tuning Using Stochastic Optimization for Graded Signal Control

Cited by: 31
Authors
Prashanth, L. A. [1 ]
Bhatnagar, Shalabh [1 ]
Affiliations
[1] Indian Inst Sci, Dept Comp Sci & Automat, Bangalore 560012, Karnataka, India
Keywords
Deterministic perturbation sequences; intelligent transportation systems; simultaneous perturbation stochastic approximation (SPSA); stochastic optimization; threshold tuning; traffic signal control; TRAFFIC SIGNALS; REAL-TIME; APPROXIMATION; SYSTEM; NETWORKS;
DOI
10.1109/TVT.2012.2209904
Chinese Library Classification (CLC)
TM [Electrical Technology]; TN [Electronic Technology, Communication Technology];
Discipline Classification Codes
0808; 0809;
摘要
Adaptive control of traffic lights is a key component of any intelligent transportation system. Many real-time traffic light control (TLC) algorithms are based on graded thresholds, because precise information about the traffic congestion in the road network is hard to obtain in practice. For example, using thresholds L1 and L2, we could mark the congestion level on a particular lane as "low," "medium," or "high" based on whether the queue length on the lane is below L1, between L1 and L2, or above L2, respectively. However, the TLC algorithms that were proposed in the literature incorporate fixed values for the thresholds, which, in general, are not optimal for all traffic conditions. In this paper, we present an algorithm based on stochastic optimization to tune the thresholds that are associated with a TLC algorithm for optimal performance. We also propose the following three novel TLC algorithms: 1) a full-state Q-learning algorithm with state aggregation, 2) a Q-learning algorithm with function approximation that involves an enhanced feature selection scheme, and 3) a priority-based TLC scheme. All these algorithms are threshold based. Next, we combine the threshold-tuning algorithm with the three aforementioned algorithms. Such a combination results in several interesting consequences. For example, in the case of Q-learning with full-state representation, our threshold-tuning algorithm suggests an optimal way of clustering states to reduce the cardinality of the state space, and in the case of the Q-learning algorithm with function approximation, our (threshold-tuning) algorithm provides a novel feature adaptation scheme to obtain an "optimal" selection of features. Our tuning algorithm is an incremental-update online scheme with proven convergence to the optimal values of thresholds. Moreover, the additional computational effort that is required because of the integration of the tuning scheme in any of the graded-threshold-based TLC algorithms is minimal. Simulation results show a significant gain in performance when our threshold-tuning algorithm is used in conjunction with various TLC algorithms compared to the original TLC algorithms without tuning and with fixed thresholds.
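As a minimal illustration of the graded-threshold rule described in the abstract (a sketch only; the function name and signature are hypothetical, not taken from the paper), the low/medium/high grading could look like this in Python:

```python
def congestion_grade(queue_length: float, l1: float, l2: float) -> str:
    """Grade a lane's congestion as in the abstract: 'low' below L1,
    'medium' between L1 and L2, and 'high' above L2."""
    if queue_length < l1:
        return "low"
    if queue_length <= l2:
        return "medium"
    return "high"
```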
Pages: 3865-3880
Number of pages: 16
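The abstract describes the tuning algorithm as an incremental-update online scheme, and the keywords name simultaneous perturbation stochastic approximation (SPSA). The sketch below shows a generic two-measurement SPSA step with random Rademacher perturbations; note that the paper's keywords mention deterministic perturbation sequences, and the cost function here (e.g., average vehicle delay from a simulation) is a stand-in, so this is an assumption-laden illustration rather than the authors' algorithm.

```python
import numpy as np

def spsa_threshold_step(theta, cost, delta=0.5, step=0.05, rng=None):
    """One standard two-measurement SPSA update of a threshold vector.

    theta : np.ndarray of thresholds, e.g. np.array([L1, L2])
    cost  : callable mapping thresholds -> scalar performance cost
            (hypothetical, e.g. simulated average delay)
    """
    rng = rng or np.random.default_rng()
    # Perturb every coordinate simultaneously with +/-1 (Rademacher) noise,
    # so only two cost evaluations are needed regardless of dimension.
    perturb = rng.choice([-1.0, 1.0], size=theta.shape)
    y_plus = cost(theta + delta * perturb)
    y_minus = cost(theta - delta * perturb)
    # Elementwise gradient estimate, then a small descent step.
    g_hat = (y_plus - y_minus) / (2.0 * delta * perturb)
    return theta - step * g_hat

# Hypothetical usage: tune [L1, L2] against a toy quadratic cost.
if __name__ == "__main__":
    target = np.array([5.0, 12.0])
    toy_cost = lambda th: float(np.sum((th - target) ** 2))
    theta = np.array([2.0, 20.0])
    for _ in range(200):
        theta = spsa_threshold_step(theta, toy_cost)
    print(theta)  # should approach [5, 12]
```

The appeal of SPSA in this setting, as the abstract suggests, is its low overhead: the gradient estimate costs only two performance evaluations per update no matter how many thresholds are tuned.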