SMAC-tuned Deep Q-learning for Ramp Metering

Cited by: 0

Authors:

ElSamadisy, Omar [1,3]

Abdulhai, Yazeed [1]

Xue, Haoyuan [2]

Smirnov, Ilia [1]

Khalil, Elias B. [2]

Abdulhai, Baher [1]

Affiliations:

[1] Univ Toronto, Dept Civil Engn, Toronto, ON, Canada

[2] Univ Toronto, Dept Mech & Ind Engn, Toronto, ON, Canada

[3] Arab Acad Sci Technol & Maritime Transport, Coll Engn & Technol, Dept Elect Commun Engn, Alexandria, Egypt

Source:

2023 IEEE INTERNATIONAL CONFERENCE ON SMART MOBILITY, SM | 2023

Keywords:

Ramp metering; Reinforcement learning; Hyperparameter tuning;

DOI:

10.1109/SM57895.2023.10112246

Chinese Library Classification (CLC) number:

TP39 [Computer applications];

Discipline classification code:

081203 ; 0835 ;

Abstract:

Demand for transportation increases as a city's population grows, yet significant infrastructure expansion is not feasible because of spatial, financial, and environmental limitations. As a result, improving infrastructure efficiency is becoming increasingly critical. Ramp metering (RM) with deep reinforcement learning (RL) is one method to tackle this problem. However, fine-tuning RL hyperparameters for RM is yet to be explored in the literature, potentially leaving performance improvements on the table. In this paper, the Sequential Model-based Algorithm Configuration (SMAC) method is used to fine-tune the values of two essential hyperparameters of the deep RL ramp metering model: the discount factor and the decay of the explore/exploit ratio. Around 350 experiments with different configurations were run with PySMAC (a Python interface to the hyperparameter optimization tool SMAC) and compared to random search as a baseline. The best reward discount factor found indicates that the RL agent should focus on immediate rewards and pay little attention to future rewards. The selected value for the exploration ratio decay rate, in turn, shows that the RL agent should prefer to decrease the exploration rate early. Both random search and SMAC show the same performance improvement of 19
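
The two hyperparameters named in the abstract, and the random-search baseline, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the exponential shape of the exploration schedule, the search ranges, and in particular `toy_objective`, a hypothetical stand-in for "train the ramp metering agent and return its cost". The sketch does not use PySMAC or any traffic simulator.

```python
import random


def discounted_return(rewards, gamma):
    """Sum of rewards weighted by gamma**t.

    A small gamma makes later rewards contribute almost nothing,
    which is what "focus on immediate rewards" means in practice.
    """
    return sum(r * gamma**t for t, r in enumerate(rewards))


def epsilon_schedule(step, decay, eps_start=1.0, eps_min=0.05):
    """Exploration rate after `step` updates (assumed exponential decay)."""
    return max(eps_min, eps_start * decay**step)


def toy_objective(gamma, decay):
    """Hypothetical cost surface; a real run would train and evaluate the agent."""
    return (gamma - 0.3) ** 2 + (decay - 0.9) ** 2


def random_search(objective, n_trials, seed=0):
    """Random-search baseline: sample configurations uniformly, keep the best."""
    rng = random.Random(seed)
    best_config, best_cost = None, float("inf")
    for _ in range(n_trials):
        config = {
            "gamma": rng.uniform(0.0, 0.999),   # assumed range
            "decay": rng.uniform(0.8, 0.9999),  # assumed range
        }
        cost = objective(config["gamma"], config["decay"])
        if cost < best_cost:
            best_config, best_cost = config, cost
    return best_config, best_cost


if __name__ == "__main__":
    config, cost = random_search(toy_objective, n_trials=350)
    print(config, cost)
```

A model-based configurator such as SMAC differs from this baseline in that it uses the costs observed so far to choose which configuration to try next, rather than sampling each one independently.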

Citation:

Pages: 65-72

Number of pages: 8

50 items in total

[21] Deep Surrogate Q-Learning for Autonomous Driving
Kalweit, Maria
Kalweit, Gabriel
Werling, Moritz
Boedecker, Joschka
2022 IEEE INTERNATIONAL CONFERENCE ON ROBOTICS AND AUTOMATION (ICRA 2022), 2022, : 1578 - 1584
[22] Trading ETFs with Deep Q-Learning Algorithm
Hong, Shao-Yan
Liu, Chien-Hung
Chen, Woei-Kae
You, Shingchern D.
2020 IEEE INTERNATIONAL CONFERENCE ON CONSUMER ELECTRONICS - TAIWAN (ICCE-TAIWAN), 2020,
[23] Deep Q-Learning for Aggregator Price Design
Pigott, Aisling
Baker, Kyri
Mosiman, Cory
2021 IEEE POWER & ENERGY SOCIETY GENERAL MEETING (PESGM), 2021,
[24] Diagnosing Bottlenecks in Deep Q-learning Algorithms
Fu, Justin
Kumar, Aviral
Soh, Matthew
Levine, Sergey
INTERNATIONAL CONFERENCE ON MACHINE LEARNING, VOL 97, 2019, 97
[25] NeuroHex: A Deep Q-learning Hex Agent
Young, Kenny
Vasan, Gautham
Hayward, Ryan
COMPUTER GAMES: 5TH WORKSHOP ON COMPUTER GAMES, CGW 2016, AND 5TH WORKSHOP ON GENERAL INTELLIGENCE IN GAME-PLAYING AGENTS, GIGA 2016, HELD IN CONJUNCTION WITH THE 25TH INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE, IJCAI 2016, NEW YORK, USA, JULY 9-10, 2016, 2017, 705 : 3 - 18
[26] QLP: Deep Q-Learning for Pruning Deep Neural Networks
Camci, Efe
Gupta, Manas
Wu, Min
Lin, Jie
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, 2022, 32 (10) : 6488 - 6501
[27] An Optimal Control Method for Expressways Entering Ramps Metering Based on Q-Learning
Ji, Xiaofeng
He, Zenghui
ICICTA: 2009 SECOND INTERNATIONAL CONFERENCE ON INTELLIGENT COMPUTATION TECHNOLOGY AND AUTOMATION, VOL I, PROCEEDINGS, 2009, : 739 - 741
[28] Deep Reinforcement Learning with Sarsa and Q-Learning: A Hybrid Approach
Xu, Zhi-xiong
Cao, Lei
Chen, Xi-liang
Li, Chen-xi
Zhang, Yong-liang
Lai, Jun
IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS, 2018, E101D (09) : 2315 - 2322
[29] Adaptive Learning Recommendation Strategy Based on Deep Q-learning
Tan, Chunxi
Han, Ruijian
Ye, Rougang
Chen, Kani
APPLIED PSYCHOLOGICAL MEASUREMENT, 2020, 44 (04) : 251 - 266
[30] A Deep Reinforcement Learning Approach for Ramp Metering Based on Traffic Video Data
Liu, Bing
Tang, Yu
Ji, Yuxiong
Shen, Yu
Du, Yuchuan
Hindawi Limited, 2021
