Urban Traffic Control in Software Defined Internet of Things via a Multi-Agent Deep Reinforcement Learning Approach

Cited by: 159

Authors:

Yang, Jiachen [1]

Zhang, Jipeng [1]

Wang, Huihui [2]

Affiliations:

[1] Tianjin Univ, Sch Elect & Informat Engn, Tianjin 300072, Peoples R China

[2] Jacksonville Univ, Dept Engn, Jacksonville, FL 32211 USA

Source:

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS | 2021 / Vol. 22 / No. 06

Funding:

National Natural Science Foundation of China;

Keywords:

Machine learning; Feature extraction; Switches; Software; Internet of Things; Protocols; Urban traffic control; software defined internet of things; multi-agent deep reinforcement learning; modified proximal policy optimization; SIGNAL CONTROL; SYSTEM; LEVEL;

DOI:

10.1109/TITS.2020.3023788

Chinese Library Classification number:

TU [Building Science];

Subject classification code:

0813 ;

Abstract:

With the growth in the number of vehicles and the acceleration of urbanization, urban traffic congestion has become a pressing issue in our society. Constructing a software defined Internet of Things (SD-IoT) with a proper traffic control scheme is a promising solution to this issue. However, existing traffic control schemes do not take full advantage of the advances in multi-agent deep reinforcement learning. Furthermore, existing traffic congestion solutions based on deep reinforcement learning (DRL) focus only on controlling traffic light signals and ignore controlling vehicles to cooperate with the traffic lights, so the resulting urban traffic control is not comprehensive enough. In this article, we propose a Modified Proximal Policy Optimization (Modified PPO) algorithm, which is well suited as the traffic control scheme of SD-IoT. We adaptively adjust the clip hyperparameter to bound the distance between the next policy and the current policy. Moreover, based on the data collected in the SD-IoT, the proposed algorithm controls traffic lights and vehicles from a global view to improve the performance of urban traffic control. Experimental results under different vehicle numbers show that the proposed method is more competitive and stable than the original algorithm. The proposed method improves the performance of SD-IoT in relieving traffic congestion.
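
The clip hyperparameter mentioned in the abstract is the one from the standard PPO clipped surrogate objective. The following is a minimal Python sketch of that objective with a clip range that changes during training. The abstract does not state how the clip value is adapted, so the linear schedule in `adaptive_eps` (and its `eps_start` and `eps_end` defaults) is purely a hypothetical placeholder and should not be read as the authors' rule; the function names are likewise illustrative.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantages, eps):
    """Standard PPO clipped surrogate objective (a value to be maximized).

    new_logp / old_logp: log-probabilities of the taken actions under the
    next and the current policy; advantages: advantage estimates.
    """
    total = 0.0
    for nlp, olp, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nlp - olp)  # probability ratio pi_new / pi_old
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))
        # Take the pessimistic (smaller) of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)


def adaptive_eps(progress, eps_start=0.2, eps_end=0.05):
    """HYPOTHETICAL schedule (an assumption, not the paper's rule):
    shrink the clip range linearly as training progress goes from 0 to 1."""
    progress = max(0.0, min(1.0, progress))
    return eps_start + (eps_end - eps_start) * progress


if __name__ == "__main__":
    eps = adaptive_eps(progress=0.5)
    value = clipped_surrogate([math.log(2.0)], [0.0], [1.0], eps)
    print(eps, value)
```

A smaller clip range keeps the next policy closer to the current one, which is the bounding effect the abstract describes; how that range should actually vary is the part the paper itself specifies.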

Pages: 3742-3754

Page count: 13
