Markov-game modeling of cyclist-pedestrian interactions in shared spaces: A multi-agent adversarial inverse reinforcement learning approach

Cited by: 41
Authors
Alsaleh, Rushdi [1]
Sayed, Tarek [1]
Affiliations
[1] Univ British Columbia, Dept Civil Engn, 6250 Appl Sci Lane, Vancouver, BC V6T 1Z4, Canada
Keywords
Shared space modeling; Microsimulation; Multi-agent models; Cyclists and pedestrians; Reward functions; FRAMEWORK;
DOI
10.1016/j.trc.2021.103191
Chinese Library Classification number
U [Transportation]
Discipline classification code
08; 0823
Abstract
Understanding and modeling road user dynamics and their microscopic interaction behaviour at shared space facilities are crucial for several applications, including safety and performance evaluations. Despite the multi-agent nature of road user interactions, most previous studies modeled them within a single-agent framework, i.e., treating the other interacting agents as part of a passive environment. This assumption is unrealistic and could limit a model's accuracy and transferability in non-stationary road user environments. This study proposes a novel Multi-Agent Adversarial Inverse Reinforcement Learning (MA-AIRL) approach to model and simulate road user interactions at shared space facilities. Unlike the traditional game-theoretic framework, which models multi-agent systems as a single time-step payoff, the proposed approach is based on Markov Games (MG), which model road users' sequential decisions concurrently. Moreover, the proposed model can handle boundedly rational agents, e.g., agents with limited information access, through the Logistic Stochastic Best Response Equilibrium (LSBRE) solution concept. The proposed algorithm recovers road users' multi-agent reward functions using adversarial deep neural network discriminators and estimates their optimal policies using Multi-agent Actor-Critic with Kronecker factors (MACK) deep reinforcement learning. Data from three shared space locations in Vancouver, BC and New York City, New York are used in this study. The model's performance is compared with a baseline single-agent Gaussian Process Inverse Reinforcement Learning (GPIRL) model. The results show that the multi-agent modeling framework led to significantly more accurate predictions of road users' behaviour and their evasive action mechanisms. Moreover, the reward functions recovered with the single-agent approach failed to capture the equilibrium solution concept, unlike those recovered with the multi-agent approach.
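As a rough illustration of the adversarial discriminator idea mentioned in the abstract, the sketch below uses the discriminator form common in the general AIRL literature, D = exp(f) / (exp(f) + pi(a|s)). It is not taken from this paper: the function names are hypothetical, and a scalar `f_value` stands in for the output of a per-agent neural network.

```python
import math


def airl_discriminator(f_value: float, policy_prob: float) -> float:
    """AIRL-style discriminator D = exp(f) / (exp(f) + pi(a|s)).

    f_value:     learned reward-like score f(s, a) (assumed scalar here).
    policy_prob: probability pi(a|s) the current policy assigns to the action.
    """
    # Algebraically, D equals sigmoid(f - log pi), which avoids forming exp(f).
    logit = f_value - math.log(policy_prob)
    return 1.0 / (1.0 + math.exp(-logit))


def airl_reward_signal(f_value: float, policy_prob: float) -> float:
    """Policy training signal log D - log(1 - D), equal to f - log pi(a|s)."""
    d = airl_discriminator(f_value, policy_prob)
    return math.log(d) - math.log(1.0 - d)
```

In a multi-agent setting, one such discriminator would typically be kept per agent, each scoring that agent's own action given the joint state; that arrangement is an assumption here rather than a detail stated in the abstract.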
Pages: 23