Complementary Attention Gated Network for Pedestrian Trajectory Prediction

Cited by: 0

Authors:

Duan, Jinghai ^{[1]}

Wang, Le ^{[2]}

Long, Chengjiang ^{[3]}

Zhou, Sanping ^{[2]}

Zheng, Fang ^{[1]}

Shi, Liushuai ^{[1]}

Hua, Gang ^{[4]}

Affiliations:

[1] Xi An Jiao Tong Univ, Sch Software Engn, Xian, Peoples R China

[2] Xi An Jiao Tong Univ, Inst Artificial Intelligence & Robot, Xian, Peoples R China

[3] JD Finance Amer Corp, Mountain View, CA USA

[4] Wormpex AI Res, Bellevue, WA USA

Source:

THIRTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTY-FOURTH CONFERENCE ON INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE / THE TWELFTH SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE | 2022

Funding:

China Postdoctoral Science Foundation; National Key Research and Development Program of China;

Keywords:

VISUAL RECOGNITION; CROWDS;

DOI:

Not available

Chinese Library Classification (CLC) number:

TP18 [Artificial intelligence theory];

Subject classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

Pedestrian trajectory prediction is crucial in many practical applications due to the diversity of pedestrian movements, such as social interactions and individual motion behaviors. With similar observable trajectories and social environments, different pedestrians may make completely different future decisions. However, most existing methods focus only on the frequent modes of the trajectory and thus generalize poorly to peculiar scenarios, which degrades their multimodal fitting ability when facing similar scenarios. In this paper, we propose a complementary attention gated network (CAGN) for pedestrian trajectory prediction, in which a dual-path architecture including normal and inverse attention is proposed to capture both frequent and peculiar modes in spatial and temporal patterns, respectively. Specifically, a complementary block is proposed to guide normal and inverse attention, which are then summed with learnable weights by a gated network to obtain attention features. Finally, multiple trajectory distributions are estimated from the fused spatio-temporal attention features to account for the multimodality of future trajectories. Experimental results on the benchmark datasets ETH and UCY demonstrate that our method outperforms state-of-the-art methods by 13.8% in Average Displacement Error (ADE) and 10.4% in Final Displacement Error (FDE).
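The abstract reports its gains in ADE and FDE. As a minimal sketch of how these two metrics are commonly defined for a single predicted trajectory (this is not the authors' evaluation code; the function name `displacement_errors` and the toy coordinates are hypothetical, chosen only for illustration): ADE averages the Euclidean distance between predicted and ground-truth positions over all future timesteps, while FDE takes that distance at the final timestep only.

```python
import math


def displacement_errors(pred, truth):
    """Return (ADE, FDE) for one predicted trajectory.

    pred, truth: equal-length sequences of (x, y) positions,
    one entry per future timestep.
    """
    if len(pred) != len(truth) or len(pred) == 0:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Euclidean distance between prediction and ground truth at each timestep.
    dists = [
        math.hypot(px - tx, py - ty)
        for (px, py), (tx, ty) in zip(pred, truth)
    ]
    ade = sum(dists) / len(dists)  # average over all timesteps
    fde = dists[-1]                # error at the last timestep only
    return ade, fde


# Toy example (illustrative values): the prediction drifts upward by
# one unit per step, so the per-step errors are 0, 1 and 2.
truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pred = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
ade, fde = displacement_errors(pred, truth)
print(ade, fde)  # prints: 1.0 2.0
```

Both metrics are in the units of the input coordinates, and lower is better. A percentage improvement such as those quoted in the abstract would be a relative reduction of these values against a baseline.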


Pages: 542-550

Page count: 9

References: 48 in total

