Global Grouped Coordinate Attention for Transformer in Pedestrian Trajectory Prediction

Cited by: 0

Authors:

Xiao, Jiashuai [1]

Zhang, Jing [2]

Li, Zilong [1]

Affiliations:

[1] Tiangong Univ, Sch Mech Engn, 399 Binshui West Rd, Tianjin 300387, Peoples R China

[2] Tiangong Univ, Sch Comp Sci & Technol, 399 Binshui West Rd, Tianjin 300387, Peoples R China

Source:

SIGNAL IMAGE AND VIDEO PROCESSING | 2025 / Vol. 19 / Issue 09

Keywords:

Deep Learning; Pedestrian Behavior Analysis; Pedestrian Trajectory Prediction; Computer Vision; Pedestrian Detection

DOI:

10.1007/s11760-025-04299-x

Chinese Library Classification (CLC) number:

TM [Electrical Engineering]; TN [Electronics and Communication Technology]

Discipline classification code:

0808; 0809

Abstract:

Pedestrian trajectory prediction is a critical research area in computer vision, with substantial application potential in intelligent transportation systems and video surveillance. However, the complex and unpredictable nature of pedestrian movement makes the task challenging. To improve prediction accuracy, this study introduces an approach based on model fusion. We propose a novel Global Grouped Coordinate Attention (GGCA) module, which enhances the Transformer model's ability to model spatial dependencies by combining channel grouping operations with bidirectional global pooling. Specifically, we first select a classic pedestrian trajectory prediction network as the base architecture, and then integrate the GGCA module into it to optimize feature representation and generate the final predictions. To assess the proposed method, we conducted experiments on the ETH-UCY and SDD datasets. The results indicate that the hybrid model improves prediction accuracy over the original model. On the SDD dataset, the Average Displacement Error (ADE) decreased from 7.80 to 7.74, and the Final Displacement Error (FDE) decreased from 12.89 to 12.65. These improvements support the efficacy of our approach. The model-fusion-based prediction strategy presented in this paper therefore improves prediction accuracy and provides useful insights for further work on pedestrian trajectory prediction.
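For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how ADE and FDE are commonly computed for a single trajectory, together with the relative reduction implied by the reported SDD figures. The function names `displacement_errors` and `relative_reduction` and the toy trajectory are illustrative assumptions; the abstract does not specify the authors' evaluation code, units, or sampling protocol.

```python
import math

def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one trajectory given as lists of (x, y) points."""
    if len(pred) != len(truth) or not pred:
        raise ValueError("trajectories must be non-empty and equally long")
    # Euclidean distance between predicted and true position at each time step
    dists = [math.hypot(px - tx, py - ty)
             for (px, py), (tx, ty) in zip(pred, truth)]
    ade = sum(dists) / len(dists)  # mean error over all predicted steps
    fde = dists[-1]                # error at the final predicted step only
    return ade, fde

def relative_reduction(before, after):
    """Percentage by which an error metric decreased."""
    return 100.0 * (before - after) / before

# Toy trajectory with made-up numbers (not taken from the paper)
pred = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
truth = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, truth)
print(ade, fde)  # 1.0 2.0

# Relative reductions implied by the SDD figures quoted in the abstract
print(round(relative_reduction(7.80, 7.74), 2))    # 0.77 (percent, ADE)
print(round(relative_reduction(12.89, 12.65), 2))  # 1.86 (percent, FDE)
```

Under these assumptions, the reported SDD results correspond to roughly a 0.77% reduction in ADE and a 1.86% reduction in FDE relative to the original model.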


Pages: 9
