DySeT: A Dynamic Masked Self-distillation Approach for Robust Trajectory Prediction

Citations: 0
Authors
Pourkeshavarz, Mozhgan [1]
Zhang, Junrui [1]
Rasouli, Amir [1]
Affiliations
[1] Huawei, Noah's Ark Lab, Montreal, PQ, Canada
Source
COMPUTER VISION - ECCV 2024, PT III | 2025 / Vol. 15061
Keywords
DOI
10.1007/978-3-031-72646-0_19
Chinese Library Classification (CLC) number
TP18 [Artificial intelligence theory];
Discipline classification codes
081104; 0812; 0835; 1405
Abstract
The lack of generalization capability of behavior prediction models for autonomous vehicles is a crucial concern for safe motion planning. One way to address this is via self-supervised pre-training through masked trajectory prediction. However, the existing models rely on uniform random sampling of tokens, which is sub-optimal because it implies that all components of driving scenes are equally informative. In this paper, to enable more robust representation learning, we introduce a dynamic masked self-distillation approach to identify and utilize informative aspects of the scenes, particularly those corresponding to complex driving behaviors, such as overtaking. Specifically, for targeted sampling, we propose a dynamic method that prioritizes tokens, such as trajectory or lane segments, based on their informativeness. The latter is determined via an auxiliary network that estimates token distributions. Through sampler optimization, more informative tokens are rewarded and selected as visible based on the policy gradient algorithm adopted from reinforcement learning. In addition, we propose a masked self-distillation approach to transfer knowledge from fully visible to masked scene representations. The distillation process not only enriches the semantic information within the visible token set but also progressively refines the sampling process. Further, we use an integrated training regime to enhance the model's ability to learn meaningful representations from informative tokens. Our extensive evaluation on two large-scale trajectory prediction datasets demonstrates the superior performance of the proposed method and its improved prediction robustness across different scenarios.
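The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of rewarding informative tokens with a policy-gradient (REINFORCE-style) update. It is not the authors' method: the auxiliary network is replaced by a plain list of logits, sampling is done with replacement for simplicity, and the names `reinforce_step` and `toy_reward`, the reward values, and all hyperparameters are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_step(logits, reward_fn, num_draws, lr, rng):
    """One REINFORCE update of a categorical token sampler (illustrative only).

    Draws `num_draws` token indices with replacement, scores each draw with
    `reward_fn`, and moves the logits along the policy-gradient estimate
    (reward - baseline) * d log p(i) / d logits = (reward - baseline) * (onehot(i) - p).
    """
    probs = softmax(logits)
    draws = rng.choices(range(len(logits)), weights=probs, k=num_draws)
    rewards = [reward_fn(i) for i in draws]
    baseline = sum(rewards) / len(rewards)  # mean-reward baseline to reduce variance
    grad = [0.0] * len(logits)
    for i, r in zip(draws, rewards):
        advantage = r - baseline
        for j in range(len(logits)):
            indicator = 1.0 if j == i else 0.0
            grad[j] += advantage * (indicator - probs[j])
    new_logits = [x + lr * g / num_draws for x, g in zip(logits, grad)]
    visible = sorted(set(draws))  # distinct sampled tokens are treated as visible
    return new_logits, visible


def toy_reward(token_index):
    """Hypothetical informativeness score: token 4 is the only informative one."""
    return 1.0 if token_index == 4 else 0.0


rng = random.Random(0)
logits = [0.0] * 6  # six scene tokens (e.g. trajectory or lane segments), uniform start
for _ in range(200):
    logits, visible = reinforce_step(logits, toy_reward, num_draws=8, lr=0.5, rng=rng)
probs = softmax(logits)
```

In this toy setting the sampler gradually concentrates probability on the token that receives reward, which is the behavior the abstract attributes to sampler optimization. In the paper the reward signal and the sampler are presumably learned jointly with the prediction model; that coupling is not modeled here.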
Pages: 324-342
Page count: 19