TEMI-MOT: Towards Efficient Multi-Modality Instance-Aware Feature Learning for 3D Multi-Object Tracking

Cited by: 0
Authors
Hu, Yufeng [1 ]
Zhou, Sanping [1 ]
Dong, Jinpeng [1 ]
Zheng, Nanning [1 ]
Affiliations
[1] Xi'an Jiaotong University, Institute of Artificial Intelligence and Robotics, National Engineering Research Center for Visual Information and Applications, National Key Laboratory of Human-Machine Hybrid Augmented Intelligence, Xi'an, People's Republic of China
Source
2023 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS (IJCNN) | 2023
Funding
US National Science Foundation;
Keywords
multi-object tracking; multi-modality fusion; autonomous driving;
DOI
10.1109/IJCNN54540.2023.10191718
Chinese Library Classification
TP18 [Artificial Intelligence Theory];
Discipline Codes
081104 ; 0812 ; 0835 ; 1405 ;
Abstract
3D multi-object tracking is a key technology for autonomous driving, aiming to ensure that autonomous vehicles accurately perceive the movements and intentions of surrounding traffic participants. In recent years, several multi-modality 3D multi-object tracking methods have been proposed. Although these methods improve the accuracy of object association during tracking, they still struggle to handle feature ambiguity caused by occlusion, incorrect feature alignment between different modalities, and confusion of adjacent targets' features caused by coarse-grained feature maps. To address these problems, we propose a new multi-modality feature learning method for 3D multi-object tracking, named TEMI-MOT, which is composed of three modules in series: a point-guided image feature sampler, an instance-aware feature encoder, and a tracking pipeline. The point-guided image feature sampler aligns the point-cloud and image features, the instance-aware feature encoder fuses the aligned image features with each object's points to generate discriminative instance-aware features, and the tracking pipeline outputs the final results based on the instance-aware features and G-IoU geometric similarities. Our approach achieves state-of-the-art results on the nuScenes dataset among methods using CenterPoint detections. The experimental results show that the proposed method offers better robustness and effectiveness for 3D multi-object tracking.
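The abstract names two concrete mechanisms: point-guided image feature sampling (aligning each LiDAR point with a pixel-level image feature) and G-IoU geometric similarity for association. The sketch below is not the authors' implementation; all function names, tensor shapes, and the axis-aligned-box simplification in the G-IoU helper are assumptions made for illustration. The first function projects LiDAR points into a camera feature map and bilinearly samples a per-point image feature, which is the standard way point-to-pixel alignment is realized; the second computes a generalized IoU between two bird's-eye-view boxes of the kind a tracking pipeline could use as a geometric similarity.

import torch
import torch.nn.functional as F

def sample_image_features(points, feat_map, cam_intrinsic, lidar_to_cam):
    """Project LiDAR points into one camera and sample aligned image features.

    points:        (N, 3) LiDAR-frame xyz
    feat_map:      (C, H, W) image feature map (assumed scaled to H x W pixels)
    cam_intrinsic: (3, 3) camera intrinsics
    lidar_to_cam:  (4, 4) extrinsic transform from LiDAR to camera frame
    returns:       (N, C) per-point image features; zeros for points behind
                   the camera (out-of-image points are zeroed by grid_sample).
    """
    N = points.shape[0]
    homo = torch.cat([points, points.new_ones(N, 1)], dim=1)   # (N, 4)
    cam_pts = (lidar_to_cam @ homo.T).T[:, :3]                 # (N, 3) camera frame
    valid = cam_pts[:, 2] > 1e-3                               # in front of the camera
    uvw = (cam_intrinsic @ cam_pts.T).T                        # (N, 3)
    uv = uvw[:, :2] / uvw[:, 2:3].clamp(min=1e-3)              # pixel coordinates

    C, H, W = feat_map.shape
    # Normalize pixel coordinates to [-1, 1] for grid_sample (align_corners=True).
    grid = torch.stack(
        [uv[:, 0] / (W - 1) * 2 - 1, uv[:, 1] / (H - 1) * 2 - 1], dim=-1
    )
    sampled = F.grid_sample(
        feat_map[None], grid[None, None],                      # output (1, C, 1, N)
        mode="bilinear", padding_mode="zeros", align_corners=True,
    )[0, :, 0].T                                               # (N, C)
    return sampled * valid[:, None]                            # zero points behind camera

def giou_bev_aligned(box_a, box_b):
    """Generalized IoU of two axis-aligned BEV boxes given as (x1, y1, x2, y2).

    Rotated-box G-IoU is common in 3D MOT; the axis-aligned version here
    keeps the sketch short while showing the enclosing-hull penalty.
    """
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    # Smallest enclosing box; the second term penalizes non-overlapping pairs,
    # so G-IoU stays informative even when IoU is zero.
    cx1, cy1 = min(box_a[0], box_b[0]), min(box_a[1], box_b[1])
    cx2, cy2 = max(box_a[2], box_b[2]), max(box_a[3], box_b[3])
    hull = (cx2 - cx1) * (cy2 - cy1)
    return inter / union - (hull - union) / hull

In a full pipeline of the kind the abstract describes, the sampled per-point image features would then be fused with each detected instance's own points before the instance-aware encoding step; that fusion and the track-management logic are omitted here.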
Pages: 9