RLfOLD: Reinforcement Learning from Online Demonstrations in Urban Autonomous Driving

Cited by: 0

Authors:

Coelho, Daniel [1, 2]

Oliveira, Miguel [1, 2]

Santos, Vitor [1, 2]

Affiliations:

[1] Univ Aveiro, Dept Mech Engn, P-3810193 Aveiro, Portugal

[2] Univ Aveiro, Inst Elect & Informat Engn Aveiro IEETA, Intelligent Syst Associate Lab LASI, P-3810193 Aveiro, Portugal

Source:

THIRTY-EIGHTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE, VOL 38 NO 10 | 2024

Keywords:

FRAMEWORK;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Reinforcement Learning from Demonstrations (RLfD) has emerged as an effective method by fusing expert demonstrations into Reinforcement Learning (RL) training, harnessing the strengths of both Imitation Learning (IL) and RL. However, existing algorithms rely on offline demonstrations, which can introduce a distribution gap between the demonstrations and the actual training environment, limiting their performance. In this paper, we propose a novel approach, Reinforcement Learning from Online Demonstrations (RLfOLD), that leverages online demonstrations to address this limitation, ensuring the agent learns from relevant and up-to-date scenarios, thus effectively bridging the distribution gap. Unlike conventional policy networks used in typical actor-critic algorithms, RLfOLD introduces a policy network that outputs two standard deviations: one for exploration and the other for IL training. This novel design allows the agent to adapt to varying levels of uncertainty inherent in both RL and IL. Furthermore, we introduce an exploration process guided by an online expert, incorporating an uncertainty-based technique. Our experiments on the CARLA NoCrash benchmark demonstrate the effectiveness and efficiency of RLfOLD. Notably, even with a significantly smaller encoder and a single-camera setup, RLfOLD surpasses state-of-the-art methods in this evaluation. These results, achieved with limited resources, highlight RLfOLD as a highly promising solution for real-world applications.
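
Illustrative sketch (not taken from the paper): the abstract describes a policy network that outputs two standard deviations, one for exploration and one for IL training. The Python below shows one way such a head could look, assuming a plain linear layer over encoder features, a shared mean, and a Gaussian negative log-likelihood as the IL loss. The names TwoStdGaussianHead, sample_action and il_negative_log_likelihood are hypothetical, and none of these choices should be read as the authors' actual architecture or loss.

```python
import math
import random


class TwoStdGaussianHead:
    """Hypothetical linear policy head: one mean and two standard deviations
    per action dimension (an assumption, not the paper's architecture)."""

    def __init__(self, feature_dim, action_dim, seed=0):
        rng = random.Random(seed)

        def init_weights():
            # Small random weights, one row per action dimension.
            return [[rng.uniform(-0.1, 0.1) for _ in range(feature_dim)]
                    for _ in range(action_dim)]

        self.w_mean = init_weights()
        self.w_log_std_explore = init_weights()  # drives exploration noise
        self.w_log_std_il = init_weights()       # used only in the IL loss

    @staticmethod
    def _linear(weights, features):
        return [sum(w * x for w, x in zip(row, features)) for row in weights]

    def forward(self, features):
        mean = self._linear(self.w_mean, features)
        # Predict log-std and exponentiate so both deviations stay positive.
        std_explore = [math.exp(v)
                       for v in self._linear(self.w_log_std_explore, features)]
        std_il = [math.exp(v)
                  for v in self._linear(self.w_log_std_il, features)]
        return mean, std_explore, std_il


def sample_action(mean, std_explore, rng):
    """Draw an exploratory action using the exploration deviation."""
    return [rng.gauss(m, s) for m, s in zip(mean, std_explore)]


def il_negative_log_likelihood(mean, std_il, expert_action):
    """Gaussian NLL of the expert action under the IL deviation
    (assumed loss form, for illustration only)."""
    nll = 0.0
    for m, s, a in zip(mean, std_il, expert_action):
        nll += 0.5 * ((a - m) / s) ** 2 + math.log(s) + 0.5 * math.log(2 * math.pi)
    return nll
```

In this sketch, the exploration deviation affects only how actions are sampled, and the IL deviation affects only how strongly a mismatch with the expert action is penalised, so the two uncertainties can be adjusted independently.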

Citation

Pages: 11660-11668

Number of pages: 9
