Offline Reinforcement Learning for Autonomous Driving with Real World Driving Data

Cited by: 13
Authors
Fang, Xing [1 ,3 ]
Zhang, Qichao [1 ,2 ,4 ]
Gao, Yinfeng [1 ,5 ]
Zhao, Dongbin [1 ,2 ,4 ]
Affiliations
[1] Chinese Acad Sci, Inst Automat, State Key Lab Management & Control Complex Syst, Beijing 100190, Peoples R China
[2] Univ Chinese Acad Sci, Sch Artificial Intelligence, Beijing 100049, Peoples R China
[3] Univ Elect Sci & Technol China, Sch Math Sci, Chengdu 611731, Peoples R China
[4] Peng Cheng Lab, Shenzhen, Peoples R China
[5] Univ Sci & Technol Beijing, Sch Automat & Elect Engn, Beijing 100083, Peoples R China
Source
2022 IEEE 25TH INTERNATIONAL CONFERENCE ON INTELLIGENT TRANSPORTATION SYSTEMS (ITSC) | 2022
Funding
National Natural Science Foundation of China
Keywords
DOI
10.1109/ITSC55140.2022.9922100
CLC number
TP18 [Artificial Intelligence Theory]
Discipline codes
081104; 0812; 0835; 1405
Abstract
Since traditional reinforcement learning (RL) approaches require active online interaction with the environment, previous work has mainly been investigated in simulation rather than in the real world, especially for safety-critical applications. Offline RL has recently emerged as a promising data-driven learning paradigm that learns a policy directly from an offline dataset. Offline RL therefore seems well suited to autonomous driving, since it is feasible to collect naturalistic driving datasets offline. However, it remains unclear how to deploy offline RL with real-world driving datasets that contain only observation data, and whether current offline RL algorithms can learn a better driving policy than imitation learning. In this paper, we provide an offline RL benchmark for autonomous driving, including a dataset, baselines, and a data-driven simulator. First, we summarize and introduce the popular offline RL baseline methods. Then, we construct an offline RL dataset for the car-following task based on the real-world driving dataset INTERACTION. A data-driven simulator is applied to obtain augmented data and to test the driving policy. Further, we deploy four popular offline algorithms and analyze their performance on different datasets, including real-world driving data and augmented data. Finally, we give conclusions and discussions that analyze the critical challenges for offline RL in autonomous driving.
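The abstract's core pipeline — learning a driving policy directly from logged (state, action) pairs rather than through online interaction — can be illustrated with a minimal behavior-cloning sketch. Everything below is an illustrative assumption, not the paper's actual setup: the car-following state features (gap, ego speed, lead speed), the synthetic "expert" weights, and the linear least-squares fit are stand-ins for a real INTERACTION-derived dataset and the offline RL baselines the paper benchmarks.

```python
import numpy as np

# Illustrative offline car-following dataset: each row is one logged step.
# Assumed features (not from the paper): [gap_m, ego_speed_mps, lead_speed_mps]
rng = np.random.default_rng(0)
n = 1000
states = rng.uniform([5.0, 0.0, 0.0], [50.0, 20.0, 20.0], size=(n, 3))

# Synthetic "expert" policy: accelerate toward the leader's speed,
# modulated by gap (a stand-in for real logged driver actions).
true_w = np.array([0.02, -0.5, 0.5])
actions = states @ true_w + rng.normal(scale=0.01, size=n)

# Behavior cloning as least squares: the simplest purely offline baseline,
# against which offline RL methods are typically compared.
w_hat, *_ = np.linalg.lstsq(states, actions, rcond=None)

def policy(state):
    """Map a state vector to a longitudinal acceleration."""
    return float(state @ w_hat)

print(np.round(w_hat, 2))  # should be close to true_w
```

The limitation this sketch makes visible is exactly the paper's question: behavior cloning can only mimic logged actions, whereas offline RL methods additionally try to improve on them while avoiding out-of-distribution states — which is why augmented data from a data-driven simulator matters for evaluation.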
Pages: 3417-3422
Page count: 6
Related papers
32 entries in total
  • [1] [Anonymous], 2019, INT C MACH LEARN
  • [2] Batarseh FA, 2020, DATA DEMOCRACY: AT THE NEXUS OF ARTIFICIAL INTELLIGENCE, SOFTWARE DEVELOPMENT, AND KNOWLEDGE ENGINEERING, P179, DOI 10.1016/B978-0-12-818366-3.00010-1
  • [3] Chang M-F, Lambert J, Sangkloy P, Singh J, Bak S, Hartnett A, Wang D, Carr P, Lucey S, Ramanan D, Hays J. Argoverse: 3D Tracking and Forecasting with Rich Maps. 2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR 2019), 2019: 8740-8749
  • [4] Degrave J, Felici F, Buchli J, et al. Magnetic control of tokamak plasmas through deep reinforcement learning. NATURE, 2022, 602 (7897): 414+
  • [5] Dosovitskiy Alexey, 2017, PMLR, P1
  • [6] Fu J., 2020, D4RL DATASETS DEEP D
  • [7] Fujimoto S, 2018, PR MACH LEARN RES, V80
  • [8] Fujimoto Scott, 2021, Advances in Neural Information Processing Systems
  • [9] Gulcehre C., 2020, Advances in Neural Information Processing Systems
  • [10] Kidambi Rahul, 2020, ADV NEURAL INFORM PR, V33, P21810