DySeT: A Dynamic Masked Self-distillation Approach for Robust Trajectory Prediction

Cited by: 0

Authors:

Pourkeshavarz, Mozhgan [1]

Zhang, Junrui [1]

Rasouli, Amir [1]

Affiliations:

[1] Huawei, Noah's Ark Lab, Montreal, PQ, Canada

Source:

COMPUTER VISION - ECCV 2024, PT III | 2025 / Vol. 15061

Keywords:

DOI:

10.1007/978-3-031-72646-0_19

Chinese Library Classification:

TP18 [Artificial Intelligence Theory];

Discipline classification codes:

081104 ; 0812 ; 0835 ; 1405 ;

Abstract:

The lack of generalization capability of behavior prediction models for autonomous vehicles is a crucial concern for safe motion planning. One way to address this is via self-supervised pre-training through masked trajectory prediction. However, the existing models rely on uniform random sampling of tokens, which is sub-optimal because it implies that all components of driving scenes are equally informative. In this paper, to enable more robust representation learning, we introduce a dynamic masked self-distillation approach to identify and utilize informative aspects of the scenes, particularly those corresponding to complex driving behaviors, such as overtaking. Specifically, for targeted sampling, we propose a dynamic method that prioritizes tokens, such as trajectory or lane segments, based on their informativeness. The latter is determined via an auxiliary network that estimates token distributions. Through sampler optimization, more informative tokens are rewarded and selected as visible based on the policy gradient algorithm adopted from reinforcement learning. In addition, we propose a masked self-distillation approach to transfer knowledge from fully visible to masked scene representations. The distillation process not only enriches the semantic information within the visible token set but also progressively refines the sampling process. Further, we use an integrated training regime to enhance the model's ability to learn meaningful representations from informative tokens. Our extensive evaluation on two large-scale trajectory prediction datasets demonstrates the superior performance of the proposed method and its improved prediction robustness across different scenarios.
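The sampling idea in the abstract can be illustrated with a small toy. The sketch below is not the authors' implementation: it assumes a vector of per-token logits stands in for the auxiliary network, uses a hypothetical reward (the share of pre-labelled "informative" tokens that end up visible), and applies a REINFORCE-style update with a running baseline. The gradient uses the simpler independent-draw log-likelihood as a surrogate for sampling without replacement. All function and parameter names are illustrative, and the self-distillation part is not covered.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_visible(probs, k, rng):
    """Draw k distinct token indices, each draw proportional to probs."""
    remaining = list(range(len(probs)))
    chosen = []
    for _ in range(k):
        # A tiny epsilon keeps the weights valid if probabilities underflow.
        weights = [probs[i] + 1e-12 for i in remaining]
        pick = rng.choices(remaining, weights=weights, k=1)[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


def reinforce_step(logits, visible, reward, baseline, lr):
    """One policy-gradient update of the sampler logits.

    For softmax probabilities p, the gradient of sum_{i in visible} log p_i
    with respect to logit j is 1[j in visible] - k * p_j.
    """
    probs = softmax(logits)
    advantage = reward - baseline
    k = len(visible)
    updated = []
    for j, logit in enumerate(logits):
        grad = (1.0 if j in visible else 0.0) - k * probs[j]
        updated.append(logit + lr * advantage * grad)
    return updated


def train_sampler(num_tokens=8, k=3, steps=200, seed=0):
    """Toy loop: reward the sampler when it keeps 'informative' tokens visible."""
    rng = random.Random(seed)
    informative = {2, 5}  # hypothetical labels, used only for this toy reward
    logits = [0.0] * num_tokens  # zero logits correspond to uniform sampling
    baseline = 0.0
    for _ in range(steps):
        visible = sample_visible(softmax(logits), k, rng)
        reward = sum(1 for i in visible if i in informative) / len(informative)
        logits = reinforce_step(logits, visible, reward, baseline, lr=0.2)
        baseline = 0.9 * baseline + 0.1 * reward  # running-average baseline
    return softmax(logits)
```

Starting from zero logits reproduces uniform random masking; calling `train_sampler()` returns the learned selection probabilities, which can be compared against the uniform value `1 / num_tokens` to see how the sampler has shifted.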

Pages: 324-342

Page count: 19
