Multi-Agent Decision-Making Modes in Uncertain Interactive Traffic Scenarios via Graph Convolution-Based Deep Reinforcement Learning

Cited: 17
Authors
Gao, Xin [1 ]
Li, Xueyuan [1 ]
Liu, Qi [1 ]
Li, Zirui [1 ,2 ]
Yang, Fan [1 ]
Luan, Tian [1 ]
Affiliations
[1] Beijing Inst Technol, Sch Mech Engn, Beijing 100080, Peoples R China
[2] Delft Univ Technol, Fac Civil Engn & Geosci, Dept Transport & Planning, Stevinweg 1, NL-2628 CN Delft, Netherlands
Keywords
multi-mode decision-making; connected autonomous vehicles; reward function matrix; uncertain highway exit scene; GQN; MDGQN; PLANNING PROCESS; AUTONOMOUS VANS;
DOI
10.3390/s22124586
CLC Classification Number
O65 [Analytical Chemistry];
Discipline Classification Codes
070302; 081704;
Abstract
Although the reward function is one of the main elements of reinforcement learning, its design often receives too little attention when reinforcement learning is applied to concrete problems, which leads to unsatisfactory performance. In this study, a reward function matrix is proposed for training various decision-making modes, with emphasis on decision-making styles and, further, on incentives and punishments. Additionally, we model the traffic scene as a graph to better represent the interactions between vehicles, and adopt a graph convolutional network (GCN) to extract features of the graph structure, enabling the connected autonomous vehicles to make decisions directly. Furthermore, we combine the GCN with deep Q-learning and with multi-step double deep Q-learning to train four decision-making modes; the resulting networks are named the graph convolutional deep Q-network (GQN) and the multi-step double graph convolutional deep Q-network (MDGQN). In simulation, the superiority of the reward function matrix is demonstrated by comparison with a baseline, and evaluation metrics are proposed to quantify the performance differences among the decision-making modes. Results show that, by adjusting the weight values in the reward function matrix, the trained decision-making modes can satisfy various driving requirements, covering task completion rate, safety, comfort, and completion efficiency. Finally, the decision-making modes trained by MDGQN outperformed those trained by GQN in an uncertain highway exit scene.
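The abstract's pipeline — a GCN that extracts node features from the vehicle-interaction graph, a Q-value head that scores discrete driving actions, and a weighted reward vector that selects a decision-making style — can be illustrated with a minimal NumPy sketch. Everything here is a hedged stand-in: the graph, feature and action dimensions, the random weights, and the reward terms are illustrative assumptions, not the paper's trained model or actual reward function matrix.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: ReLU(D^{-1/2} (A+I) D^{-1/2} H W)."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # symmetric degree normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

# Hypothetical toy scene: 4 vehicles with 3 features each (e.g., x, y, speed)
# and 5 discrete driving actions; weights are random stand-ins for trained ones.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)   # vehicle-interaction graph
H = rng.standard_normal((4, 3))             # per-vehicle node features
W1 = rng.standard_normal((3, 8))            # GCN layer weights
Wq = rng.standard_normal((8, 5))            # linear Q-value head

q_values = gcn_layer(A, H, W1) @ Wq         # per-vehicle Q-values, shape (4, 5)
actions = q_values.argmax(axis=1)           # greedy action for each vehicle

# One row of a reward-function matrix (hypothetical weights): weighting task
# completion, safety, comfort, and efficiency terms selects a driving style.
w = np.array([0.4, 0.3, 0.2, 0.1])
r_terms = np.array([1.0, -0.5, 0.2, 0.8])   # illustrative per-step reward terms
reward = float(w @ r_terms)
```

In this reading, changing `w` while keeping the same network yields a different decision-making mode, which matches the abstract's claim that the modes are obtained by adjusting weight values rather than by redesigning the learner.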
Pages: 18