Offline Reinforcement Learning for Autonomous Driving with Real World Driving Data

Cited by: 13
Authors
Fang, Xing [1 ,3 ]
Zhang, Qichao [1 ,2 ,4 ]
Gao, Yinfeng [1 ,5 ]
Zhao, Dongbin [1 ,2 ,4 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[3] Univ Elect Sci & Technol China, Sch Math Sci, Chengdu 611731, Peoples R China
[4] Peng Cheng Lab, Shenzhen, Peoples R China
[5] Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China
Source
2022 IEEE 25TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS (ITSC) | 2022
Funding
National Natural Science Foundation of China
Keywords
DOI
10.1109/ITSC55140.2022.9922100
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Since traditional reinforcement learning (RL) approaches require active online interaction with the environment, previous work has mainly been investigated in simulation rather than in the real world, especially for safety-critical applications. Offline RL has recently emerged as a promising data-driven learning paradigm that learns a policy directly from an offline dataset. Offline RL therefore seems well suited to autonomous driving, since it is feasible to collect naturalistic driving datasets offline. However, it remains unclear how to deploy offline RL with real-world driving datasets that contain only observation data, and whether current offline RL algorithms can learn a better driving policy than imitation learning. In this paper, we provide an offline RL benchmark for autonomous driving, including a dataset, baselines, and a data-driven simulator. First, we summarize and introduce the popular offline RL baseline methods. Then, we construct an offline RL dataset for the car-following task based on the real-world driving dataset INTERACTION. A data-driven simulator is applied to obtain augmented data and to test the driving policy. Further, we deploy four popular offline algorithms and analyze their performance on different datasets, including real-world driving data and augmented data. Finally, we give conclusions and discussions that analyze the critical challenges for offline RL in autonomous driving.
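The abstract's core pipeline — learning a driving policy directly from logged (state, action) pairs rather than through online interaction — can be illustrated with a minimal behavior-cloning sketch. Everything below is an illustrative assumption, not the paper's actual setup: the car-following state features (gap, ego speed, lead speed), the synthetic "expert" weights, and the linear least-squares fit are stand-ins for a real INTERACTION-derived dataset and the offline RL baselines the paper benchmarks.

```python
import numpy as np

# Illustrative offline car-following dataset: each row is one logged step.
# Assumed features (not from the paper): [gap_m, ego_speed_mps, lead_speed_mps]
rng = np.random.default_rng(0)
n = 1000
states = rng.uniform([5.0, 0.0, 0.0], [50.0, 20.0, 20.0], size=(n, 3))

# Synthetic "expert" policy: accelerate toward the leader's speed,
# modulated by gap (a stand-in for real logged driver actions).
true_w = np.array([0.02, -0.5, 0.5])
actions = states @ true_w + rng.normal(scale=0.01, size=n)

# Behavior cloning as least squares: the simplest purely offline baseline,
# against which offline RL methods are typically compared.
w_hat, *_ = np.linalg.lstsq(states, actions, rcond=None)

def policy(state):
    """Map a state vector to a longitudinal acceleration."""
    return float(state @ w_hat)

print(np.round(w_hat, 2))  # should be close to true_w
```

The limitation this sketch makes visible is exactly the paper's question: behavior cloning can only mimic logged actions, whereas offline RL methods additionally try to improve on them while avoiding out-of-distribution states — which is why augmented data from a data-driven simulator matters for evaluation.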
Pages: 3417-3422
Page count: 6
Related papers
32 entries in total
  • [1] [Anonymous], 2019, INT C MACH LEARN
  • [2] Batarseh FA, 2020, DATA DEMOCRACY: AT THE NEXUS OF ARTIFICIAL INTELLIGENCE, SOFTWARE DEVELOPMENT, AND KNOWLEDGE ENGINEERING, P179, DOI 10.1016/B978-0-12-818366-3.00010-1
  • [3] Chang M-F, Lambert J, Sangkloy P, Singh J, Bak S, Hartnett A, Wang D, Carr P, Lucey S, Ramanan D, Hays J. Argoverse: 3D Tracking and Forecasting with Rich Maps. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 8740-8749
  • [4] Degrave J, Felici F, Buchli J, et al. Magnetic control of tokamak plasmas through deep reinforcement learning. NATURE, 2022, 602 (7897): 414+
  • [5] Dosovitskiy Alexey, 2017, PMLR, P1
  • [6] Fu J., 2020, D4RL DATASETS DEEP D
  • [7] Fujimoto S, 2018, PR MACH LEARN RES, V80
  • [8] Fujimoto Scott, 2021, Advances in Neural Information Processing Systems
  • [9] Gulcehre C., 2020, Advances in Neural Information Processing Systems
  • [10] Kidambi Rahul, 2020, ADV NEURAL INFORM PR, V33, P21810