Data-efficient Deep Reinforcement Learning for Vehicle Trajectory Control

Cited by: 1
Authors
Frauenknecht, Bernd [1 ]
Ehlgen, Tobias [2 ]
Trimpe, Sebastian [1 ]
Affiliations
[1] Rhein Westfal TH Aachen, Inst Data Sci Mech Engn DSME, D-52068 Aachen, Germany
[2] ZF Friedrichshafen AG, AI Lab, D-88045 Friedrichshafen, Germany
Source
2023 IEEE 26TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS, ITSC | 2023
Keywords
STABILITY;
DOI
10.1109/ITSC57777.2023.10422451
CLC Classification Number
TP [Automation Technology; Computer Technology]
Discipline Classification Code
0812
Abstract
Advanced vehicle control is a fundamental building block in the development of autonomous driving systems. Reinforcement learning (RL) promises control performance superior to classical approaches while keeping computational demands low during deployment. However, standard RL approaches such as soft actor-critic (SAC) require collecting extensive amounts of training data and are thus impractical for real-world application. To address this issue, we apply recently developed data-efficient deep RL methods to vehicle trajectory control. Our investigation focuses on three methods, so far unexplored for vehicle control: randomized ensembled double Q-learning (REDQ), probabilistic ensembles with trajectory sampling and model predictive path integral optimization (PETS-MPPI), and model-based policy optimization (MBPO). We find that, in the case of trajectory control, the standard model-based RL formulation used in approaches such as PETS-MPPI and MBPO is unsuitable. We therefore propose a new formulation that splits dynamics prediction and vehicle localization. Our benchmark study on the CARLA simulator reveals that the three identified data-efficient deep RL approaches learn control strategies on a par with or better than SAC, while reducing the required number of environment interactions by more than an order of magnitude.
Pages: 894-901
Page count: 8