Autonomous Vehicles Roundup Strategy by Reinforcement Learning with Prediction Trajectory

Cited: 0
Authors
Ni, Jiayang [1 ]
Ma, Rubing [1 ]
Zhong, Hua [2 ]
Wang, Bo [1 ]
Affiliations
[1] Beijing Inst Technol, Sch Automat, Beijing 100081, Peoples R China
[2] Beijing Aerosp Control Ctr, Beijing 100094, Peoples R China
Source
2022 41st Chinese Control Conference (CCC) | 2022
Keywords
autonomous vehicle roundup; reinforcement learning; artificial potential field; trajectory prediction; neural networks; level; game; Go
DOI
not available
CLC Number
TP [Automation Technology, Computer Technology];
Discipline Code
0812;
Abstract
Autonomous vehicles are increasingly applied in many situations, but their autonomous decision-making ability still needs improvement. Multi-Agent Deep Deterministic Policy Gradient (MADDPG) adopts centralized evaluation with decentralized execution, so that each autonomous vehicle can obtain whole-field state information and make decisions using its teammates' information. During training, we introduce an artificial potential field, action guidance, and other methods to alleviate the problem of sparse rewards, and we add a repulsion function to account for the spatial relationship between team vehicles. An Extended Kalman Filter (EKF) is applied to predict autonomous vehicle trajectories, changing the state input of the training network. A secondary correction of the predicted trajectory adjusts the prediction range as training proceeds, improving training convergence speed even as the speed of the opposing agents increases. Simulation experiments show that the convergence speed and win rate of the MADDPG algorithm based on trajectory prediction and the artificial potential field are significantly improved, and that it adapts well to various task scenarios.
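The artificial-potential-field reward shaping with a teammate repulsion term described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the gains `K_ATT`, `K_REP`, and the safety distance `D_SAFE` are assumed values, and the exact potential functions used in the paper are not given in this record.

```python
import math

# Illustrative gains (assumptions; the paper's actual values are not stated here)
K_ATT = 1.0    # attractive gain pulling a pursuer toward the evader
K_REP = 0.5    # repulsive gain keeping teammate pursuers apart
D_SAFE = 2.0   # distance below which teammates repel each other

def dist(a, b):
    """Euclidean distance between two 2-D points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def shaped_reward(pursuer, evader, teammates):
    """Dense shaping reward: negative attractive potential toward the
    evader, minus a repulsion penalty when teammates crowd each other."""
    r = -K_ATT * dist(pursuer, evader)            # closer to evader -> higher reward
    for mate in teammates:
        d = dist(pursuer, mate)
        if 0.0 < d < D_SAFE:                      # penalize crowding within D_SAFE
            r -= K_REP * (1.0 / d - 1.0 / D_SAFE)
    return r
```

A dense term like this gives the agent a gradient toward the target at every step, which is the standard way potential fields are used to mitigate sparse-reward training.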
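The EKF-based trajectory prediction can likewise be sketched under a constant-velocity motion model. This model, the time step `DT`, and the noise covariance `Q` are assumptions for illustration (the abstract does not specify the process model); with a linear model the EKF predict step reduces to the ordinary Kalman predict step. The idea is that the n-step-ahead predicted position, rather than the current one, feeds the policy network's state input.

```python
import numpy as np

DT = 0.1  # illustrative time step in seconds (assumption)

# State is [x, y, vx, vy]; F extrapolates position by velocity each step.
F = np.array([[1, 0, DT, 0],
              [0, 1, 0, DT],
              [0, 0, 1,  0],
              [0, 0, 0,  1]], dtype=float)   # state-transition matrix
Q = 0.01 * np.eye(4)                          # process-noise covariance (assumed)

def ekf_predict(x, P, steps=1):
    """Roll the state estimate and its covariance forward `steps` steps."""
    for _ in range(steps):
        x = F @ x              # propagate the state through the motion model
        P = F @ P @ F.T + Q    # grow the uncertainty accordingly
    return x, P

# Vehicle at the origin moving at (1.0, 0.5) m/s, predicted 10 steps ahead
x0 = np.array([0.0, 0.0, 1.0, 0.5])
x_pred, P_pred = ekf_predict(x0, np.eye(4), steps=10)
```

Varying `steps` over the course of training mirrors the abstract's "secondary correction" idea of widening or narrowing the prediction range as training proceeds.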
Pages: 3370-3375
Page count: 6
Related Papers (10)
[1]  Julier, S. J.; Uhlmann, J. K. Unscented filtering and nonlinear estimation. Proceedings of the IEEE, 2004, 92(03): 401-422.
[2]  Liu Quan. Chinese Journal of Computers, 2018, 41: 1.
[3]  Mnih, V.; Kavukcuoglu, K.; Silver, D.; et al. Human-level control through deep reinforcement learning. Nature, 2015, 518(7540): 529-533.
[4]  Rusk, N. Deep learning. Nature Methods, 2016, 13(01): 35.
[5]  Schmidhuber, J. Deep learning in neural networks: An overview. Neural Networks, 2015, 61: 85-117.
[6]  Silver, D.; Schrittwieser, J.; Simonyan, K.; et al. Mastering the game of Go without human knowledge. Nature, 2017, 550(7676): 354+.
[7]  Silver, D.; Huang, A.; Maddison, C. J.; et al. Mastering the game of Go with deep neural networks and tree search. Nature, 2016, 529(7587): 484+.
[8]  Vinyals, O.; Babuschkin, I.; Czarnecki, W. M.; et al. Grandmaster level in StarCraft II using multi-agent reinforcement learning. Nature, 2019, 575(7782): 350+.
[9]  Zhang Jiandong; Yang Qiming; Shi Guoqing; et al. UAV cooperative air combat maneuver decision based on multi-agent reinforcement learning. Journal of Systems Engineering and Electronics, 2021, 32(06): 1421-1438.
[10]  Zhang Jie; Wang Gang; Yue Shaohua; et al. Multi-agent system application in accordance with game theory in bi-directional coordination network model. Journal of Systems Engineering and Electronics, 2020, 31(02): 279-289.