3-D Multiple Extended Object Tracking by Fusing Roadside Radar and Camera Sensors

Cited: 0
Authors
Deng, Jiayin [1 ]
Hu, Zhiqun [1 ]
Lu, Zhaoming [1 ]
Wen, Xiangming [1 ]
Affiliations
[1] Beijing Univ Posts & Telecommun, Beijing Lab Adv Informat Networks, Beijing 100876, Peoples R China
Keywords
Radar tracking; Trajectory; Three-dimensional displays; Visualization; Sensors; Radar; Probability density function; Radio frequency; Cameras; Symbols; Multiple extended object tracking (MEOT); radar and camera fusion; random finite sets (RFSs); roadside perception
DOI
10.1109/JSEN.2024.3493952
Chinese Library Classification (CLC)
TM [Electrical Technology]; TN [Electronic Technology; Communication Technology]
Subject Classification Codes
0808; 0809
Abstract
Multiple object tracking is a key component of intelligent transportation systems. Most existing tracking methods rely on deep learning frameworks that require extensive, high-quality 3-D annotated datasets, which limits their deployment in real-world scenarios. To address this problem, we present a novel 3-D multiple extended object tracking (MEOT) method that fuses radar and camera data using only pixel-level annotations. The fusion framework employs a joint detection and tracking (JDT) strategy based on random finite set (RFS) theory, and we propose a multimodal trajectory Poisson multi-Bernoulli (MTPMB) tracker that, for the first time, incorporates multimodal data from both radar and camera. In the update stage, we propose a novel backprojection method and a corresponding visual measurement model based on a mixture density network (MDN), which automatically learns the transformation from pixel coordinates to the 3-D road surface so that visual measurements can be accurately backprojected into 3-D space. Additionally, we introduce a doped association method that uses visual measurements both to assist in splitting radar measurements from nearby objects and to establish correspondence between radar and visual measurements. Experimental results demonstrate the effectiveness and superiority of the proposed method compared with several state-of-the-art 3-D MEOT methods that do not use 3-D annotations. Code is available at https://github.com/RadarCameraFusionTeam-BUPT/ES-MEOT.
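The MDN-based backprojection described in the abstract lends itself to a compact illustration. Below is a minimal PyTorch sketch of a mixture density network that maps pixel coordinates (u, v) to a Gaussian mixture over 3-D road-surface points; the class name BackprojectionMDN, the layer sizes, and the component count are illustrative assumptions, not the architecture used in the paper.

    # Hypothetical sketch of an MDN mapping pixel (u, v) to a Gaussian
    # mixture over 3-D road-surface points (x, y, z). All names, layer
    # sizes, and the component count K are assumptions for illustration.
    import torch
    import torch.nn as nn

    class BackprojectionMDN(nn.Module):
        def __init__(self, num_components: int = 5, hidden: int = 64):
            super().__init__()
            self.K = num_components
            self.body = nn.Sequential(
                nn.Linear(2, hidden), nn.Tanh(),
                nn.Linear(hidden, hidden), nn.Tanh(),
            )
            self.pi = nn.Linear(hidden, self.K)             # mixture logits
            self.mu = nn.Linear(hidden, self.K * 3)         # component means
            self.log_sigma = nn.Linear(hidden, self.K * 3)  # log std devs

        def forward(self, uv: torch.Tensor):
            h = self.body(uv)
            log_pi = torch.log_softmax(self.pi(h), dim=-1)
            mu = self.mu(h).view(-1, self.K, 3)
            sigma = torch.exp(self.log_sigma(h)).view(-1, self.K, 3)
            return log_pi, mu, sigma

    def mdn_nll(log_pi, mu, sigma, target):
        """Standard MDN loss: negative log-likelihood of a diagonal Gaussian mixture."""
        comp = torch.distributions.Normal(mu, sigma)
        # Sum per-coordinate log densities, then log-sum-exp over the K components.
        log_prob = comp.log_prob(target.unsqueeze(1)).sum(dim=-1)  # shape (B, K)
        return -torch.logsumexp(log_pi + log_prob, dim=-1).mean()

At inference time, such a mixture supplies both a backprojected 3-D point (e.g., the mean of the most probable component) and an uncertainty estimate, which is the kind of probabilistic output a visual measurement model in an RFS-based tracker can consume.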
Pages: 1885-1899 (15 pages)