Temporal Pyramid Network With Spatial-Temporal Attention for Pedestrian Trajectory Prediction

Cited by: 39
Authors
Li, Yuanman [1]
Liang, Rongqin [1]
Wei, Wei [2]
Wang, Wei [3]
Zhou, Jiantao [4]
Li, Xia [1]
Affiliations
[1] Shenzhen Univ, Coll Elect & Informat Engn, Guangdong Key Lab Intelligent Informat Proc, Shenzhen 518060, Guangdong, Peoples R China
[2] Xian Univ Technol, Sch Comp Sci & Engn, Xian 710600, Shaanxi, Peoples R China
[3] Sun Yat Sen Univ, Sch Intelligent Syst Engn, Guangzhou 518107, Guangdong, Peoples R China
[4] Univ Macau, State Key Lab Internet Things Smart City, Taipa 999078, Macao, Peoples R China
Source
IEEE TRANSACTIONS ON NETWORK SCIENCE AND ENGINEERING | 2022 / Vol. 9 / No. 03
Keywords
Trajectory; Prediction algorithms; Feature extraction; Predictive models; Computational modeling; Task analysis; Modulation; Deep learning; social behavior; social computing; social interactions; spatial-temporal attention; temporal pyramid network; trajectory prediction; MODELS;
DOI
10.1109/TNSE.2021.3065019
CLC classification number
T [Industrial Technology];
Discipline classification code
08;
Abstract
Understanding and predicting human motion behavior with social interactions has become an increasingly crucial problem for a vast number of applications, ranging from visual navigation of autonomous vehicles to activity prediction in intelligent video surveillance. Accurately forecasting crowd motion behavior is challenging due to the multimodal nature of trajectories and the complex social interactions between humans. Recent algorithms model and predict the trajectory at a single resolution, which makes it difficult for them to exploit the long-range and short-range information of the motion behavior simultaneously. In this paper, we propose a temporal pyramid network for pedestrian trajectory prediction through a squeeze modulation and a dilation modulation. The hierarchical design of our framework models the trajectory at multiple resolutions and can therefore better capture the motion behavior at various tempos. By progressively combining the global context with the local one, we finally construct a coarse-to-fine hierarchical pedestrian trajectory prediction framework with multi-supervision. Further, we introduce a unified spatial-temporal attention mechanism to adaptively select important information about surrounding persons in both the spatial and temporal domains. We show that our attention strategy is intuitive and effective in encoding the influence of social interactions. Experimental results on two benchmarks demonstrate the superiority of our proposed scheme.
Pages: 1006-1019
Number of pages: 14