Modeling framework of human driving behavior based on Deep Maximum Entropy Inverse Reinforcement Learning

Cited by: 0

Authors:

Wang, Yongjie [1]

Niu, Yuchen [1]

Xiao, Mei [1]

Zhu, Wenying [1]

You, Xinshang [2]

Affiliations:

[1] Changan Univ, Sch Transportat Engn, Xian 710064, Peoples R China

[2] Hebei Univ Sci & Technol, Sch Econ & Management, Shijiazhuang 050018, Peoples R China

Source:

PHYSICA A-STATISTICAL MECHANICS AND ITS APPLICATIONS | 2024 / Vol. 652

Funding:

National Natural Science Foundation of China

Keywords:

Human driving behavior; Autonomous vehicle; Occluded pedestrian; Inverse reinforcement learning; Reinforcement learning; PERFORMANCE;

DOI:

10.1016/j.physa.2024.130052

Chinese Library Classification number:

O4 [Physics]

Discipline classification code:

0702

Abstract:

Driving behavior modeling is crucial for designing safe, intelligent, and personalized autonomous driving systems. This paper introduces a modeling framework based on Markov Decision Processes (MDPs) that emulates drivers' decision-making processes. The framework combines Deep Maximum Entropy Inverse Reinforcement Learning (Deep MEIRL) with a reinforcement learning algorithm, Proximal Policy Optimization (PPO). A neural network structure is customized for Deep MEIRL; it uses the velocity of the ego vehicle, the pedestrian position, the velocity of surrounding vehicles, the lateral distance, the surrounding vehicles' type, and the distance to the crosswalk to recover the nonlinear reward function. A dataset of drone-based video footage collected in Xi'an (China) is used to train and validate the framework. The results demonstrate that Deep MEIRL-PPO outperforms the traditional modeling framework, Maximum Entropy Inverse Reinforcement Learning (MEIRL)-PPO, in modeling and predicting human driving behavior. Specifically, in predicting human driving behavior, Deep MEIRL-PPO outperforms MEIRL-PPO by 50.71% and 43.90% in terms of MAE and HD, respectively. Furthermore, Deep MEIRL-PPO accurately learns how human drivers avoid potential conflicts when lines of sight are occluded. This research can help self-driving vehicles learn human driving behavior and avoid unforeseen risks.
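The abstract compares the two frameworks using MAE and HD. As a rough illustration of how such a comparison could be computed, the sketch below implements the mean absolute error and, on the assumption that HD stands for the Hausdorff distance (the abstract does not expand the abbreviation), a symmetric Hausdorff distance between two trajectories. It also assumes that "outperforms by X%" means a relative reduction in error. The function names and the toy trajectories are hypothetical and are not taken from the paper.

```python
import math


def mean_absolute_error(predicted, observed):
    """MAE between two equal-length sequences of scalars (e.g. speeds)."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


def hausdorff_distance(traj_a, traj_b):
    """Symmetric Hausdorff distance between two sets of (x, y) points.

    Assumption: 'HD' in the abstract is read here as Hausdorff distance.
    """
    def directed(src, dst):
        # Largest distance from any point in src to its nearest point in dst.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(traj_a, traj_b), directed(traj_b, traj_a))


def relative_improvement(baseline_error, model_error):
    """Percentage by which model_error is lower than baseline_error.

    Assumption: this is one plausible reading of 'outperforms by X%'.
    """
    return 100.0 * (baseline_error - model_error) / baseline_error


# Hypothetical toy data, not from the paper's dataset.
human_traj = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
model_traj = [(0.0, 0.5), (1.0, 0.5), (2.0, 0.5)]

hd = hausdorff_distance(human_traj, model_traj)
mae = mean_absolute_error([5.0, 6.0, 7.0], [5.5, 6.0, 6.5])
gain = relative_improvement(baseline_error=2.0, model_error=1.0)
```

Under this reading, a lower MAE or HD means the predicted behavior lies closer to the recorded human behavior, and the percentages in the abstract would correspond to `relative_improvement` applied to the two frameworks' error values.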


Pages: 14
