Autonomous Vehicle Platoons in Urban Road Networks: A Joint Distributed Reinforcement Learning and Model Predictive Control Approach

Cited by: 14
Authors
D'Alfonso, Luigi [1]
Giannini, Francesco [1]
Franze, Giuseppe [2]
Fedele, Giuseppe [1]
Pupo, Francesco [1]
Fortino, Giancarlo [1]
Affiliations
[1] Univ Calabria, Dept Comp Engn Modeling Elect & Syst, Via Pietro Bucci,Cubo 42-C, I-87036 Arcavacata Di Rende, CS, Italy
[2] Univ Calabria, Dept Mech Engn Energy Engn & Management, Via Pietro Bucci,Cubo 42-C, I-87036 Arcavacata Di Rende, CS, Italy
Keywords
Deep learning; Roads; Reinforcement learning; Computer architecture; Predictive models; Routing; Mathematical models; Distributed model predictive control; Distributed reinforcement learning; Routing decisions; Urban road networks; Event-triggered control; String stability; Systems; Intelligence; Tracking
DOI
10.1109/JAS.2023.123705
Chinese Library Classification (CLC) number
TP [Automation Technology, Computer Technology]
Discipline classification code
0812
Abstract
In this paper, platoons of autonomous vehicles operating in urban road networks are considered. From a methodological point of view, the problem of interest consists of formally characterizing vehicle state trajectory tubes by means of routing decisions complying with traffic congestion criteria. To this end, a novel distributed control architecture is conceived by taking advantage of two methodologies: deep reinforcement learning and model predictive control. On the one hand, the routing decisions are obtained by using a distributed reinforcement learning algorithm that exploits available traffic data at each road junction. On the other hand, a bank of model predictive controllers is in charge of computing the most adequate control action for each involved vehicle. These tasks are combined into a single framework: the deep reinforcement learning output (action) is translated into a set-point to be tracked by the model predictive controller; conversely, the current vehicle position, resulting from the application of the control move, is exploited by the deep reinforcement learning unit to improve its reliability. The main novelty of the proposed solution lies in its hybrid nature: on the one hand, it fully exploits deep reinforcement learning capabilities for decision-making purposes; on the other hand, time-varying hard constraints are always satisfied during the dynamical platoon evolution imposed by the computed routing decisions. To efficiently evaluate the performance of the proposed control architecture, a co-design procedure involving the SUMO and MATLAB platforms is implemented, so that complex operating environments can be used and the information coming from road maps (links, junctions, obstacles, traffic lights, etc.) and vehicle state trajectories can be shared and exchanged. Finally, taking an entire real city block as the operating scenario and a platoon of eleven vehicles described by double-integrator models, several simulations have been performed to highlight the main features of the proposed approach. Moreover, in different operating scenarios, the proposed reinforcement learning scheme is capable of significantly reducing traffic congestion when compared with well-established competitors.
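The coupling described in the abstract (a routing decision turned into a set-point, then tracked by a predictive controller under hard constraints) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the learned routing policy is replaced by a hypothetical `pick_route` stub that simply selects the least congested link, the controller is a brute-force receding-horizon search over a small set of accelerations rather than the paper's distributed MPC, and the sampling time, speed limit, horizon, cost weights, and junction positions are all assumed values. Only the double-integrator vehicle model is taken from the abstract.

```python
from itertools import product

DT = 0.5                              # sampling time [s] (assumed)
U_SET = (-2.0, -1.0, 0.0, 1.0, 2.0)   # admissible accelerations [m/s^2] (assumed)
V_MAX = 10.0                          # hard speed limit [m/s] (assumed)
HORIZON = 4                           # prediction horizon in steps (assumed)


def step(state, u):
    """Discrete-time double integrator; state = (position, velocity)."""
    p, v = state
    return (p + DT * v + 0.5 * DT * DT * u, v + DT * u)


def pick_route(congestion):
    """Hypothetical stand-in for the learned routing policy:
    choose the outgoing link with the lowest congestion level."""
    return min(congestion, key=congestion.get)


def mpc_move(state, setpoint):
    """Enumerate all input sequences over the horizon, discard those that
    violate the speed limit, and return the first input of the cheapest one."""
    best_cost, best_u = float("inf"), 0.0
    for seq in product(U_SET, repeat=HORIZON):
        s, cost = state, 0.0
        for u in seq:
            s = step(s, u)
            if abs(s[1]) > V_MAX:     # hard constraint violated: reject sequence
                cost = float("inf")
                break
            # tracking error + velocity damping + small input penalty
            cost += (s[0] - setpoint) ** 2 + s[1] ** 2 + 0.1 * u ** 2
        if cost < best_cost:
            best_cost, best_u = cost, seq[0]
    return best_u


def simulate(congestion, junction_pos, steps=120):
    """Routing decision -> set-point -> receding-horizon tracking loop."""
    link = pick_route(congestion)
    setpoint = junction_pos[link]     # routing output translated into a set-point
    state, trajectory = (0.0, 0.0), []
    for _ in range(steps):
        state = step(state, mpc_move(state, setpoint))
        trajectory.append(state)      # positions a learning unit could reuse
    return link, setpoint, trajectory


if __name__ == "__main__":
    link, setpoint, traj = simulate({"north": 0.7, "east": 0.2},
                                    {"north": 100.0, "east": 60.0})
    print(link, setpoint, traj[-1])
```

In this sketch the zero-acceleration sequence is always admissible whenever the current speed is within the limit, so the search never runs out of feasible candidates; a real MPC design would replace the enumeration with a constrained optimization solver.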
Pages: 141-156
Number of pages: 16