Robust Decision Making for Autonomous Vehicles at Highway On-Ramps: A Constrained Adversarial Reinforcement Learning Approach

Cited by: 59

Authors:

He, Xiangkun [1]

Lou, Baichuan [1]

Yang, Haohan [1]

Lv, Chen [1]

Affiliations:

[1] Nanyang Technol Univ, Sch Mech & Aerosp Engn, Singapore 639798, Singapore

Source:

IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS | 2023 / Vol. 24 / Issue 04

Funding:

National Research Foundation, Singapore;

Keywords:

Autonomous vehicles; ramp merging; robust decision making; reinforcement learning; adversarial attack; AUTOMATED VEHICLES; OPTIMIZATION; GAME; GO;

DOI:

10.1109/TITS.2022.3229518

Chinese Library Classification (CLC) number:

TU [Building Science];

Discipline classification code:

0813 ;

Abstract:

Reinforcement learning has demonstrated its potential in a series of challenging domains. However, many real-world decision making tasks involve unpredictable environmental changes or unavoidable perception errors that are often enough to mislead an agent into making suboptimal decisions and even cause catastrophic failures. In light of these potential risks, reinforcement learning with application in safety-critical autonomous driving domain remains tricky without ensuring robustness against environmental uncertainties (e.g., road adhesion changes or measurement noises). Therefore, this paper proposes a novel constrained adversarial reinforcement learning approach for robust decision making of autonomous vehicles at highway on-ramps. Environmental disturbance is modelled as an adversarial agent that can learn an optimal adversarial policy to thwart the autonomous driving agent. Meanwhile, observation perturbation is approximated to maximize the variation of the perturbed policy through a white-box adversarial attack technique. Furthermore, a constrained adversarial actor-critic algorithm is presented to optimize an on-ramp merging policy while keeping the variations of the attacked driving policy and action-value function within bounds. Finally, the proposed robust highway on-ramp merging decision making method of autonomous vehicles is evaluated in three stochastic mixed traffic flows with different densities, and its effectiveness is demonstrated in comparison with the competitive baselines.

Citation

Pages: 4103-4113

Number of pages: 11

References: 47 in total

[31] A taxonomy and survey of attacks against machine learning [J].

Pitropakis, Nikolaos ;

Panaousis, Emmanouil ;

Giannetsos, Thanassis ;

Anastasiadis, Eleftherios ;

Loukas, George .

COMPUTER SCIENCE REVIEW, 2019, 34

[32] Hierarchical Reinforcement Learning Method for Autonomous Vehicle Behavior Planning [J].

Qiao, Zhiqian ;

Tyree, Zachariah ;

Mudalige, Priyantha ;

Schneider, Jeff ;

Dolan, John M. .

2020 IEEE/RSJ INTERNATIONAL CONFERENCE ON INTELLIGENT ROBOTS AND SYSTEMS (IROS), 2020, :6084-6089

[33]

Schulman J., 2017, CoRR abs/1707.06347

[34] Planning and Decision-Making for Autonomous Vehicles [J].

Schwarting, Wilko ;

Alonso-Mora, Javier ;

Rus, Daniela .

ANNUAL REVIEW OF CONTROL, ROBOTICS, AND AUTONOMOUS SYSTEMS, VOL 1, 2018, 1 :187-210

[35] Driving Tasks Transfer Using Deep Reinforcement Learning for Decision-Making of Autonomous Vehicles in Unsignalized Intersection [J].

Shu, Hong ;

Liu, Teng ;

Mu, Xingyu ;

Cao, Dongpu .

IEEE TRANSACTIONS ON VEHICULAR TECHNOLOGY, 2022, 71 (01) :41-52

[36] Mastering the game of Go without human knowledge [J].

Silver, David ;

Schrittwieser, Julian ;

Simonyan, Karen ;

Antonoglou, Ioannis ;

Huang, Aja ;

Guez, Arthur ;

Hubert, Thomas ;

Baker, Lucas ;

Lai, Matthew ;

Bolton, Adrian ;

Chen, Yutian ;

Lillicrap, Timothy ;

Hui, Fan ;

Sifre, Laurent ;

van den Driessche, George ;

Graepel, Thore ;

Hassabis, Demis .

NATURE, 2017, 550 (7676) :354-+

[37] Mastering the game of Go with deep neural networks and tree search [J].

Silver, David ;

Huang, Aja ;

Maddison, Chris J. ;

Guez, Arthur ;

Sifre, Laurent ;

van den Driessche, George ;

Schrittwieser, Julian ;

Antonoglou, Ioannis ;

Panneershelvam, Veda ;

Lanctot, Marc ;

Dieleman, Sander ;

Grewe, Dominik ;

Nham, John ;

Kalchbrenner, Nal ;

Sutskever, Ilya ;

Lillicrap, Timothy ;

Leach, Madeleine ;

Kavukcuoglu, Koray ;

Graepel, Thore ;

Hassabis, Demis .

NATURE, 2016, 529 (7587) :484-+

[38] Neural network vehicle models for high-performance automated driving [J].

Spielberg, Nathan A. ;

Brown, Matthew ;

Kapania, Nitin R. ;

Kegelman, John C. ;

Gerdes, J. Christian .

SCIENCE ROBOTICS, 2019, 4 (28)

[39]

Tessler C., 2019, PR MACH LEARN RES, V97, P6215

[40] Grandmaster level in StarCraft II using multi-agent reinforcement learning [J].

Vinyals, Oriol ;

Babuschkin, Igor ;

Czarnecki, Wojciech M. ;

Mathieu, Michael ;

Dudzik, Andrew ;

Chung, Junyoung ;

Choi, David H. ;

Powell, Richard ;

Ewalds, Timo ;

Georgiev, Petko ;

Oh, Junhyuk ;

Horgan, Dan ;

Kroiss, Manuel ;

Danihelka, Ivo ;

Huang, Aja ;

Sifre, Laurent ;

Cai, Trevor ;

Agapiou, John P. ;

Jaderberg, Max ;

Vezhnevets, Alexander S. ;

Leblond, Remi ;

Pohlen, Tobias ;

Dalibard, Valentin ;

Budden, David ;

Sulsky, Yury ;

Molloy, James ;

Paine, Tom L. ;

Gulcehre, Caglar ;

Wang, Ziyu ;

Pfaff, Tobias ;

Wu, Yuhuai ;

Ring, Roman ;

Yogatama, Dani ;

Wunsch, Dario ;

McKinney, Katrina ;

Smith, Oliver ;

Schaul, Tom ;

Lillicrap, Timothy ;

Kavukcuoglu, Koray ;

Hassabis, Demis ;

Apps, Chris ;

Silver, David .

NATURE, 2019, 575 (7782) :350-+
