Pedestrian Graph +: A Fast Pedestrian Crossing Prediction Model Based on Graph Convolutional Networks

Cited by: 47
Authors
Cadena, Pablo Rodrigo Gantier [1, 2, 3]
Qian, Yeqiang [4]
Wang, Chunxiang [1, 2, 3]
Yang, Ming [1, 2, 3]
Affiliations
[1] Shanghai Jiao Tong Univ, Dept Automat, Shanghai 200240, Peoples R China
[2] Minist Educ China, Key Lab Syst Control & Informat Proc, Shanghai 200240, Peoples R China
[3] Shanghai Engn Res Ctr Intelligent Control & Manag, Shanghai 200240, Peoples R China
[4] Shanghai Jiao Tong Univ, Univ Michigan Shanghai Jiao Tong Univ Joint Inst, Shanghai 200240, Peoples R China
Funding
National Natural Science Foundation of China;
Keywords
Predictive models; Data models; Computational modeling; Pose estimation; Task analysis; Real-time systems; Convolution; Pedestrian crossing prediction; pedestrian behavior; graph convolutional network;
DOI
10.1109/TITS.2022.3173537
Chinese Library Classification
TU [Building Science];
Discipline classification code
0813;
Abstract
Estimating when pedestrians will cross the street is essential for intelligent transportation systems. Accurate, real-time prediction is critical to ensuring the safety of the most vulnerable road users while improving passenger comfort. In the present work, we developed a model called Pedestrian Graph +, an improvement on our previous work, Pedestrian Graph, which predicts pedestrian crossing actions in urban areas using a Graph Convolutional Network. In the new model, we integrated two convolutional modules that supply additional context information (cropped images, cropped segmentation maps, and ego-vehicle velocity data) to the main Graph Convolutional module, thus increasing accuracy. Our model is faster and smaller than other state-of-the-art models while achieving equivalent accuracy, with an inference time of 6 ms (on a GTX 1080) and low memory consumption (0.3 MB). We tested our model on two datasets, Joint Attention in Autonomous Driving (JAAD) and Pedestrian Intention Estimation (PIE), achieving 86% and 89% accuracy, respectively. Another contribution of our work is the ability to dynamically process almost any input size in the time domain without significant loss of accuracy, which is possible due to the fully convolutional property of ConvNets. Our models and results are available at https://github.com/RodrigoGantier/Pedestrian_graph_plus.
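The claim about handling almost any temporal input size can be illustrated with a minimal sketch. This is not the authors' model: the function names (`temporal_conv`, `predict_crossing`), the layer sizes, and the random weights are all hypothetical, and the graph convolution over skeleton joints is omitted. It only shows why a temporal convolution followed by global average pooling over time yields a fixed-size output for any sequence length at least as long as the kernel.

```python
import numpy as np

def temporal_conv(x, w):
    """Valid 1D convolution over time.
    x: (T, C_in) feature sequence; w: (K, C_in, C_out) kernel (hypothetical sizes)."""
    k = w.shape[0]
    t_out = x.shape[0] - k + 1
    # Stack the K shifted views so each output step sees a K-frame window.
    windows = np.stack([x[i:i + t_out] for i in range(k)], axis=1)  # (t_out, K, C_in)
    # Contract over kernel position and input channels.
    return np.einsum("tkc,kco->to", windows, w)                     # (t_out, C_out)

def predict_crossing(x, w_conv, w_out):
    """Return a crossing probability for a sequence of any length T >= K."""
    h = np.maximum(temporal_conv(x, w_conv), 0.0)  # ReLU
    pooled = h.mean(axis=0)                        # global average pooling over time
    logit = pooled @ w_out                         # linear classifier head
    return 1.0 / (1.0 + np.exp(-logit))            # sigmoid

# Assumed toy dimensions: 4 input channels, kernel of 3 frames, 8 hidden channels.
rng = np.random.default_rng(0)
w_conv = rng.normal(size=(3, 4, 8))
w_out = rng.normal(size=8)

# The same weights accept sequences of different lengths.
for t in (8, 16, 32):
    x = rng.normal(size=(t, 4))
    print(t, float(predict_crossing(x, w_conv, w_out)))
```

Because no layer has weights tied to a specific number of frames, the sequence length only changes how many positions are averaged by the pooling step, not the shape of anything the classifier head sees.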
Pages: 21050-21061
Page count: 12