A Real-World Reinforcement Learning Framework for Safe and Human-Like Tactical Decision-Making

Cited by: 6
Authors
Yavas, Muharrem Ugur [1 ,2 ]
Kumbasar, Tufan [3 ]
Ure, Nazim Kemal [4 ]
Affiliations
[1] Istanbul Tech Univ, Dept Mechatron Engn, TR-34469 Istanbul, Turkiye
[2] Eatron Technol, TR-34485 Istanbul, Turkiye
[3] Istanbul Tech Univ, Dept Control & Automat Engn, TR-34469 Istanbul, Turkiye
[4] Istanbul Tech Univ, Artificial Intelligence & Data Sci Res Ctr, TR-34469 Istanbul, Turkiye
Keywords
Autonomous vehicles; reinforcement learning; artificial intelligence; intelligent vehicles; MODEL
DOI
10.1109/TITS.2023.3292981
CLC Number
TU [Architecture Science]
Discipline Code
0813
Abstract
Lane-change decision-making for vehicles is a challenging task for many reasons, including traffic rules, safety, and the stochastic nature of driving. Because of its success in solving complex problems, deep reinforcement learning (DRL) has been suggested for addressing these issues. However, the studies on DRL to date have gone no further than validation in simulation and failed to address what are arguably the most critical issues, namely, the mismatch between simulation and reality, human-likeness, and safety. This paper introduces a real-world DRL framework for decision-making to design safe and human-like agents that can operate in the real world without extra tuning. We propose a new learning paradigm for DRL integrated with Real2Sim transfer, which comprises training, validation, and testing phases. The approach involves two simulator environments with different levels of fidelity, which are parameterized via real-world data. Within the framework, a large amount of randomized experience is generated with a low-fidelity simulator, whereupon the learned skills are validated regularly in a high-fidelity simulator to avoid overfitting. Finally, in the testing phase, the agent is examined concerning safety and human-like decision-making. Extensive simulation and real-world evaluations show the superiority of the proposed approach. To the best of the authors' knowledge, this is the first application of DRL lane-changing policy in the real world.
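The abstract describes a training paradigm in which a large amount of randomized experience is generated in a low-fidelity simulator, while the learned policy is periodically validated in a high-fidelity simulator to guard against overfitting. A minimal sketch of that loop is shown below; the simulator class, function names, and scoring logic are illustrative assumptions, not the authors' actual interfaces or DRL algorithm.

```python
import random

# Hypothetical sketch of the low-/high-fidelity training paradigm described
# in the abstract. StubSimulator stands in for a driving simulator.

class StubSimulator:
    """Returns a scalar episode return; noise models simulator fidelity."""
    def __init__(self, noise, seed=0):
        self.rng = random.Random(seed)
        self.noise = noise

    def rollout(self, policy_quality):
        # Episode return grows with policy quality, perturbed by sim noise.
        return policy_quality + self.rng.gauss(0.0, self.noise)


def train_with_real2sim(epochs=50, validate_every=10, patience=2):
    low_fi = StubSimulator(noise=0.5, seed=1)   # cheap, randomized experience
    high_fi = StubSimulator(noise=0.1, seed=2)  # periodic validation
    policy_quality = 0.0
    best_val, stale = float("-inf"), 0
    for epoch in range(1, epochs + 1):
        # "Training": a placeholder for a DRL update from low-fidelity rollouts.
        low_fi.rollout(policy_quality)
        policy_quality += 0.1
        if epoch % validate_every == 0:
            val = high_fi.rollout(policy_quality)
            if val > best_val:
                best_val, stale = val, 0
            else:
                stale += 1
                if stale >= patience:
                    break  # validation stopped improving: likely overfitting
    return best_val


best = train_with_real2sim()  # best validation score in the high-fidelity sim
```

The design point this sketch captures is the separation of concerns: the low-fidelity simulator supplies volume and randomization, while the high-fidelity simulator acts as a held-out validation environment whose score, not the training return, decides when to stop.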
Pages: 11773-11784
Page count: 12