Hierarchical Latent Structure for Multi-modal Vehicle Trajectory Forecasting

Cited by: 16
Authors
Choi, Dooseop [1]
Min, KyoungWook [1]
Affiliations
[1] ETRI, Artificial Intelligence Res Lab, Daejeon, South Korea
Source
COMPUTER VISION, ECCV 2022, PT XXII | 2022 / Vol. 13682
DOI
10.1007/978-3-031-20047-2_8
Chinese Library Classification (CLC) number
TP18 [Artificial Intelligence Theory]
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
Variational autoencoder (VAE) has widely been utilized for modeling data distributions because it is theoretically elegant, easy to train, and has nice manifold representations. However, when applied to image reconstruction and synthesis tasks, VAE shows the limitation that the generated sample tends to be blurry. We observe that a similar problem, in which the generated trajectory is located between adjacent lanes, often arises in VAE-based trajectory forecasting models. To mitigate this problem, we introduce a hierarchical latent structure into the VAE-based forecasting model. Based on the assumption that the trajectory distribution can be approximated as a mixture of simple distributions (or modes), the low-level latent variable is employed to model each mode of the mixture and the high-level latent variable is employed to represent the weights for the modes. To model each mode accurately, we condition the low-level latent variable using two lane-level context vectors computed in novel ways, one corresponding to vehicle-lane interaction and the other to vehicle-vehicle interaction. The context vectors are also used to model the weights via the proposed mode selection network. To evaluate our forecasting model, we use two large-scale real-world datasets. Experimental results show that our model is not only capable of generating clear multi-modal trajectory distributions but also outperforms the state-of-the-art (SOTA) models in terms of prediction accuracy. Our code is available at https://github.com/d1024choi/HLSTrajForecast.
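The mixture assumption in the abstract can be illustrated with a toy one-dimensional sketch. This is not the authors' model: the lane positions, mode weights, spread, and the function name `sample_hierarchical` are all hypothetical, and simple Gaussians stand in for the learned modes. It only shows why choosing a mode first (the high-level step) and then sampling within it (the low-level step) keeps samples on a lane, whereas a single unimodal summary of the same data sits between the lanes.

```python
import random


def sample_hierarchical(weights, lane_centers, sigma, rng):
    """Two-stage sampling: pick a mode, then sample within that mode."""
    # High-level step: choose a mode index according to the mode weights.
    mode = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    # Low-level step: sample a lateral position around the chosen mode's center.
    return mode, rng.gauss(lane_centers[mode], sigma)


rng = random.Random(0)
lane_centers = [0.0, 3.5]  # hypothetical lateral centers of two adjacent lanes (m)
weights = [0.5, 0.5]       # hypothetical mode weights
sigma = 0.2                # hypothetical within-mode spread (m)

samples = [
    sample_hierarchical(weights, lane_centers, sigma, rng)[1]
    for _ in range(2000)
]

# The mean of a single unimodal fit lands between the two lanes.
unimodal_mean = sum(samples) / len(samples)

# Fraction of hierarchical samples that fall in the gap between the lanes.
in_gap = sum(1.0 < s < 2.5 for s in samples) / len(samples)
```

Under these assumed values, `unimodal_mean` falls in the gap between the lanes while almost no individual sample does, which mirrors the "trajectory between adjacent lanes" failure the abstract describes.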
Pages: 129-145
Number of pages: 17